PREFACE

The three Lectures delivered by the writer in London
University on March 12th, 13th and 14th, 1930, are here
presented together in a somewhat extended form. In publishing 
these Lectures I have supplied all such details as had to be left 
out, owing to the limited time at my disposal. In order that the 
present publication may be read by itself, I have also found it 
advisable to include certain details which are already accessible 
in other papers of mine. 

My best thanks are due to the Institute of Actuaries for 
facilitating this publication. 

J. F. STEFFENSEN 

COPENHAGEN 
March 1930 



First Lecture 


1. When, some time ago, I had the honour of receiving an
invitation to deliver in the University of London a course of 
lectures on a subject connected with Statistics, I felt a certain 
hesitation about accepting, because I am not so much a statistician
as an actuary. As, however, the subject on which I was invited to
lecture was not termed "statistics" without further qualification,
but was only to be a subject "connected with" statistics, I
thought I might accept after all, as most of what I have written 
is concerned with that borderland between statistics and mathe- 
matics which constitutes actuarial science. I therefore propose 
to give an account of some of the efforts I have made to introduce 
more rigour into certain questions of theoretical statistics and 
actuarial science, drawing my examples from widely different 
sources. It will be convenient to begin with a few remarks about 
the place of mathematics in statistical and actuarial science. 

The first point of view that occurs to the mind is that mathematics,
even when applied to observed data, is a science that
investigates the relations which exist between numbers. Observa- 
tions may contradict each other, owing to unavoidable errors 
of observation, but mathematical relations are not allowed to 
contain contradictions. Statistical and actuarial theory must 
therefore always be presented in such a form that the theoretical 
relations or assumptions contain no contradiction. In this first 
lecture I intend to show by examples how we may be led astray 
by neglect of this principle. 

In the second place, mathematics is the proper instrument for 
justifying methods of numerical approximation. Such methods 
frequently originate in practical work where some approximate 
method has been found to produce satisfactory results. But too 
often the computer leaves the matter at that and takes it for 
granted that the results will be equally satisfactory in other cases. 
He treats the problem as a statistical one, while its nature is 
purely mathematical. More or less consciously he obliterates the 
profound difference between interpolation and graduation, and 
combines both into a single calculus of observations. It is
hardly an exaggeration to say that it is a universal habit amongst
actuaries and statisticians to regard a formula of approximation as
definitely established when good results have been obtained in
a few trial cases. In the second lecture we will consider some
questions of this nature, and also occupy ourselves with the
allied subject of numerical inequalities.


SOME RECENT RESEARCHES IN THE

Thirdly, mathematics is employed in describing facts of
observation. The formulas used in statistics for this purpose are
often of an entirely empirical nature. But there are also cases
where theoretical reasons can be given for believing that one
formula will fit the facts better than another; and much work
can then be saved by choosing from the outset the most suitable
formula. It is therefore of considerable practical importance to
investigate the theoretical foundation of formulas derived by
speculation. Striking examples of such formulas are the types of
frequency-functions which will be discussed in the third and
last lecture.

2. I shall begin by a critical examination of the notion of
Biometric Functions. I have dealt with this subject on an earlier
occasion*, but before a mathematical rather than a statistical
audience, so that I feel justified in recapitulating my views here
and illustrating them with an application.

Before going into the objections which can be raised against 
the manner in which these questions are usually dealt with, I 
will rather introduce the biometric functions in a way which does 
not seem open to serious objection.

Let us consider a group of individuals of the same age $x$ and
selected according to the same principle with respect to the other
essential factors affecting mortality, so that the group may be
looked upon as an aggregate of repeated observations. Under
these circumstances we may assume the existence of such a
function $\mu_x$, continuous for all ages $x > 0$, that $\mu_x\,dx$ represents
the probability for a life aged $x$ of dying between the ages $x$ and
$x + dx$. The existence of this function, the force of mortality,
is a postulate, but one which is supported by the evidence of
experience; for the causes of death are either constant for all
ages (many forms of accidents), or else dependent on the way
in which the organism develops and finally wears out, and this
process is of a continuous nature.

Very little can be said a priori about the function $\mu_x$. Perhaps
the only statement that can be made without consulting mortality
observations is that there must exist a positive constant $\epsilon$,
independent of $x$, such that

$$\mu_x > \epsilon > 0 \quad (x > 0). \tag{1}$$

For a lower limit to $\mu_x$, greater than zero, can at least be
derived from the probability of dying by accident. The simple
fact expressed by (1) is, however, as we shall presently see, of
considerable importance.

* Proceedings of the Sixth Scandinavian Congress of Mathematicians,

By means of the force of mortality the other biometric functions
may be obtained as follows. Let there be $l_x$ persons alive
at age $x$. The mathematical expectation of death amongst these
persons in the interval from $x$ to $x + dx$ is $l_x \mu_x\,dx$. The expected
number living at the end of the interval $dx$ is, therefore,

$$l_{x+dx} = l_x - l_x \mu_x\,dx,$$

whence

$$\mu_x = -\frac{1}{l_x}\frac{dl_x}{dx} = -D \operatorname{Log} l_x, \tag{2}$$

where $D$ denotes the operation of differentiation, and Log the
natural logarithm.

Integrating (2), we obtain for the probability that a person
aged $x$ is alive after the time $t$

$${}_tp_x = \frac{l_{x+t}}{l_x} = e^{-\int_x^{x+t} \mu_x\,dx}, \tag{3}$$

whence in particular, for $t = 1$,

$$p_x = e^{-\int_x^{x+1} \mu_x\,dx}, \tag{4}$$

while the probability of dying within a year is $q_x = 1 - p_x$.

If, in (3), we put $x = a$ and thereafter $t = x - a$, we find

$$l_x = l_a\, e^{-\int_a^x \mu_x\,dx}. \tag{5}$$
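Formulas (3) to (5) can be tried out numerically. The following is only a minimal sketch, not part of the lecture: it assumes a Makeham force of mortality with the constants quoted in paragraph 6 below, and the names `mu` and `survival`, the Simpson step count and the ages chosen are hypothetical.

```python
import math


def mu(x):
    # Assumed Makeham force of mortality, constants as quoted in paragraph 6:
    # mu_x = 0.0058889 + 10**(0.039*x - 3.9838291)
    return 0.0058889 + 10.0 ** (0.039 * x - 3.9838291)


def survival(x, t, steps=2000):
    """Formula (3): exp(-integral of mu from x to x+t), by Simpson's rule.

    `steps` must be even.
    """
    if t == 0:
        return 1.0
    h = t / steps
    total = mu(x) + mu(x + t)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * mu(x + j * h)
    return math.exp(-total * h / 3.0)


p_60 = survival(60, 1)   # formula (4)
q_60 = 1.0 - p_60        # probability of dying within a year

# Consequence of (5): surviving 10 years is surviving 5 years twice in a row.
lhs = survival(60, 10)
rhs = survival(60, 5) * survival(65, 5)
print(q_60, lhs, rhs)
```

The two printed survival probabilities agree up to the error of the numerical integration, which is what the multiplicative structure of (3) requires.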

3. Before proceeding, let us see what general conclusions can
be drawn from these results concerning the nature of the
biometric functions.

It follows from (3) and (1) that, as $t$ increases, the probability
of surviving, ${}_tp_x$, decreases in a monotonic sense to zero without
attaining this value for any finite value of $t$.

From (1) and (5), where $l_a$ may be considered as an arbitrary
constant, it may be concluded that, as $x$ increases, the function
$l_x$ decreases in a monotonic sense to zero without attaining this
value for any finite value of $x$. We may even say something
about the rapidity of this decrease; for we have

$$l_x < l_a\, e^{-\epsilon (x-a)} \quad (x > a), \tag{6}$$

which shows that the decrease is, at least, so rapid that all the
moments

$$\int_0^\infty x^r\, l_x\,dx$$

and the integrals

$$\int_0^\infty l_x\,dx, \qquad \int_x^\infty l_x\,dx, \qquad \ldots$$

are necessarily convergent.

The interesting question, whether $\mu_x \to \infty$ as $x \to \infty$, cannot
be decided either by observation or by speculation. We obtain
from (4), by the theorem of mean value,

$$p_x = e^{-\mu_\xi} \quad (x < \xi < x + 1). \tag{7}$$

From this relation, it follows that $\mu_x \to \infty$ as $q_x \to 1$ and vice versa.
But whether $q_x \to 1$ as $x \to \infty$ is impossible to decide. All that
can be said is that this assumption, if desired, can be made
without introducing any contradiction.

There is every reason to believe that, above a certain age, the
function $\mu_x$ never decreases. As, by (4),

$$-\operatorname{Log} p_x = \int_x^{x+1} \mu_x\,dx,$$

we have under these circumstances

$$\mu_x \leq -\operatorname{Log} p_x \leq \mu_{x+1}. \tag{8}$$

But, as

$$-\operatorname{Log} p_x = -\operatorname{Log}(1 - q_x) = q_x + \tfrac{1}{2} q_x^2 + \tfrac{1}{3} q_x^3 + \ldots,$$

it follows from (8) that

$$q_x < \mu_{x+1}, \tag{9}$$

provided only that $\mu_x$ does not decrease in the interval from $x$ to
$x + 1$. As $q_x$ and $\mu_x$ do not differ greatly for the ages which are
of practical importance, this simple inequality will often be
found useful.
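The inequalities (8) and (9) can be tabulated for a concrete case. The sketch below is an illustration only: it assumes a Makeham force of mortality with the constants quoted in paragraph 6, for which the integral in (4) has a closed form, and the names `mu`, `neg_log_p` and `q` are hypothetical.

```python
import math

A = 0.0058889        # assumed Makeham constants, as quoted in paragraph 6
LOG10_B = -3.9838291
LOG10_C = 0.039


def mu(x):
    return A + 10.0 ** (LOG10_C * x + LOG10_B)


def neg_log_p(x):
    """-Log p_x, i.e. the integral of mu over [x, x+1], in closed form."""
    ln_c = LOG10_C * math.log(10.0)
    return A + (mu(x + 1) - mu(x)) / ln_c


def q(x):
    return 1.0 - math.exp(-neg_log_p(x))


rows = [(x, mu(x), neg_log_p(x), q(x), mu(x + 1)) for x in range(20, 101, 10)]
for x, m0, nlp, qx, m1 in rows:
    print(f"x={x:3d}  mu_x={m0:.5f}  -Log p_x={nlp:.5f}  "
          f"q_x={qx:.5f}  mu_(x+1)={m1:.5f}")
```

In every row the second printed column lies between the first and the last, as in (8), and $q_x$ stays below $\mu_{x+1}$, as in (9).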


4. Another important biometric function is the expectation of
life, $\bar{e}_x$, defined by

$$\bar{e}_x = \int_0^\infty {}_tp_x\,dt = \frac{1}{l_x}\int_x^\infty l_x\,dx. \tag{10}$$

We may also, by (3), express the expectation of life in terms
of $\mu_x$, thus

$$\bar{e}_x = \int_0^\infty e^{-\int_x^{x+t} \mu_x\,dx}\,dt. \tag{11}$$

Inserting the lower limit to $\mu_x$ by (1), we obtain, on performing
the integration,

$$\bar{e}_x < \frac{1}{\epsilon}. \tag{12}$$


That is, there exists a boundary, independent of $x$, which $\bar{e}_x$
cannot exceed. This result is trivial, but a more important
inequality may be proved, if it is assumed that $\mu_x$ never decreases
for $x > x_0$. In that case, it follows from (11) that

$$\bar{e}_x < \int_0^\infty e^{-\mu_x t}\,dt = \frac{1}{\mu_x} \quad (x > x_0), \tag{13}$$

whence, by (9),

$$\bar{e}_x < \frac{1}{q_{x-1}} \quad (x > x_0 + 1). \tag{14}$$

Instead of $\bar{e}_x$ it is often sufficient to consider the curtate
expectation of life defined by

$$e_x = \sum_{t=1}^{\infty} {}_tp_x = \frac{1}{l_x}\sum_{t=1}^{\infty} l_{x+t}, \tag{15}$$

which, as $l_x$ is constantly decreasing, is always smaller than $\bar{e}_x$.

We evidently have

$$e_x = p_x + p_x p_{x+1} + p_x p_{x+1} p_{x+2} + \ldots,$$

so that, if $p_x$ does not increase with $x$,

$$e_x < p_x + p_x^2 + p_x^3 + \ldots = \frac{p_x}{q_x}. \tag{16}$$

From (15) we immediately find

$$e_x = p_x (1 + e_{x+1}). \tag{17}$$

This relation shows that if $q_x \to 1$ for $x \to \infty$, then $e_x \to 0$ for
$x \to \infty$, and conversely: if $e_x \to 0$, then $q_x \to 1$.

It follows that if $\bar{e}_x \to 0$, then $\mu_x \to \infty$. The converse
proposition may be proved thus: If $\mu_x \to \infty$ as $x \to \infty$, we may
choose $x$ so large that $\mu_x > K$ for all values of $x$ above a certain
limit, and consequently by (11), $\bar{e}_x < 1/K$ for all values of $x$ above
that limit. But $K$ being arbitrary, we have $\bar{e}_x \to 0$ if $\mu_x \to \infty$.
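The bounds (13) and (16) and the recurrence (17) can be checked on a concrete table. What follows is a hedged sketch rather than anything from the lecture: it assumes the Makeham expressions for $\log l_x$ and $\mu_x$ quoted in paragraphs 5 and 6, cuts all sums and integrals off at a hypothetical age `LIMIT`, and uses hypothetical function names.

```python
import math

LIMIT = 130  # hypothetical cut-off age; l(130) is negligibly small


def l(x):
    # Assumed Makeham table:
    # log10 l_x = 5.0575047 - 0.002557551 x - 10^(0.039 x - 3.2992963)
    return 10.0 ** (5.0575047 - 0.002557551 * x
                    - 10.0 ** (0.039 * x - 3.2992963))


def mu(x):
    return 0.0058889 + 10.0 ** (0.039 * x - 3.9838291)


def complete_expectation(x, steps_per_year=200):
    """Formula (10), integrated by Simpson's rule from x to LIMIT."""
    n = (LIMIT - x) * steps_per_year   # even number of panels
    h = (LIMIT - x) / n
    total = l(x) + l(LIMIT)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * l(x + j * h)
    return total * h / 3.0 / l(x)


def curtate_expectation(x):
    """Formula (15), summed up to LIMIT; x must be an integer age."""
    return sum(l(x + t) for t in range(1, LIMIT - x + 1)) / l(x)


for x in (30, 60, 90):
    p_x = l(x + 1) / l(x)
    print(x, curtate_expectation(x), complete_expectation(x),
          1.0 / mu(x), p_x / (1.0 - p_x))
```

For each age the curtate expectation is the smallest of the printed quantities, and the complete expectation stays below $1/\mu_x$, in agreement with (13) and (16).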

5. What the text-books say about the function $l_x$ is as a rule
not expressed with sufficient reserve, and is even in many cases
positively misleading. Without going into details about the
various forms of vagueness or inaccuracy I have observed,
I think it safe to assert that an ordinary student reading a text-book
for the first time may be led to form the following
opinions on the nature of the table of $l_x$.

Assume that $l_0$ persons are born, and that we follow these
persons from their birth till their death, the number of those
who are alive after $x$ years being denoted by $l_x$. They will
certainly all die within a finite time. There is therefore an
"oldest age" $\omega$ such that $l_x > 0$ for $x < \omega$, and $l_x = 0$ for $x \geq \omega$.
No question of convergence arises, for we have simply

$$\int_x^\infty l_x\,dx = \int_x^\omega l_x\,dx,$$

the limits of integration being in reality finite. If an analytical
expression is assumed for $l_x$, it may safely be assumed that
$\int_\omega^\infty l_x\,dx$ is so small that it can be neglected.

Our student is confirmed in this view when he discovers that
the table of $l_x$ is stated in integral numbers, commencing with,
say, 100,000 persons at the lowest age and terminating with 0 at
an age about 100 years.

But this point of view contains a fallacy. It is true that any
given finite number of persons must all die within a finite time
or, to put it more precisely, the probability that one or more of
them will survive indefinitely is zero. This follows from the
above established fact that ${}_tp_x \to 0$ when $t \to \infty$. But the greater
that number of persons is, the greater will also be the number
of them who may still be alive at age 100 or any other assigned
age, and it is therefore quite impossible to maintain the existence
of any definite "oldest age" $\omega$ within which everybody must
certainly die. Let us, to put it mathematically, assume for a
moment that $\omega$ is the upper limit of the age which human life
can attain. Then we have admitted that it is possible that a
person may be alive at the age $\omega - \eta$, where $\eta$ is a quantity which
we may choose as small as we please, for instance equal to one
second. But at the exact age $\omega$, or one second later, that person
must, according to hypothesis, necessarily be dead. Does anybody
really believe that there is an age $\omega$ with this miraculous
peculiarity?

It might, however, be argued that in introducing the function
$l_x$ in the way we have advocated above, we only replace one
monstrosity by another, for we admit the possibility of a person
being alive at any age, however advanced. Nobody will ever
believe that a person can live to become 1000 years old, and from
a practical point of view this may safely be maintained. It is
practically immaterial whether we say that a person cannot
attain the age of 1000 years, or that the probability of attaining
that age* is $< 10^{-10^{35}}$, an inconceivably small number. But if
theoretical clearness can be gained by speaking of extremely
small probabilities instead of impossibilities, it should certainly
be done. Infinity is not a number, but only denotes the absence
of a boundary, and the non-occurrence in practice of observations
of a certain order of magnitude should not without
necessity be attributed to a mysterious boundary hidden somewhere,
as it is always sufficiently accounted for by their exceedingly
small probabilities, exactly as in the Theory of Errors.
Even in such an everyday subject as the Theory of Interest we
do not hesitate to make use of infinite durations in cases where
no boundary can be indicated: not because anybody believes that
a perpetuity will really continue to be paid ad infinitum, but
because it is a convenient and harmless construction, as the very
distant payments do not appreciably influence the value of the
perpetuity.

* According to the table as graduated by Makeham's formula we
have

$$\log l_x = 5.0575047 - .002557551\,x - 10^{.039x + \bar{4}.7007037},$$

whence, for instance,

$$\log \frac{l_{1000}}{l_{10}} < -10^{35}.$$
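The order of magnitude of the probability of attaining age 1000 can be reproduced in a few lines. This is a sketch under the assumption that the graduated table follows the Makeham expression $\log l_x = 5.0575047 - .002557551\,x - 10^{.039x - 3.2992963}$ (common logarithms); the function name `log10_l` is hypothetical.

```python
def log10_l(x):
    # Assumed Makeham graduation (common logarithm of l_x).
    return 5.0575047 - 0.002557551 * x - 10.0 ** (0.039 * x - 3.2992963)


# Common logarithm of the probability that a life aged 10 attains age 1000.
log10_prob = log10_l(1000) - log10_l(10)
print(log10_prob)
```

The result is a negative number of the order of $10^{35}$, so the probability itself is far too small to be represented as an ordinary floating-point number, which is why the logarithm is computed instead.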


6. An entirely different point of view from the one we have
discussed is that in actually constructing a table of $l_x$ it is
necessary to stop at a certain age which we may still call $\omega$,
although it has nothing to do with an "oldest age" in the
abstract sense of the word. Our $\omega$ will now be determined, not
by an imagined impossibility of living beyond that age, but by
considering the error committed in calculating integrals and
sums to the limit $\omega$ instead of to infinity.

It seems reasonable to determine $\omega$ in such a way that in
calculating the expectation of life by the approximate formula

$$\bar{e}_x = \frac{1}{l_x}\int_x^\omega l_x\,dx \tag{18}$$

we obtain at least three reliable decimals in the result. In that
case we must have

$$\frac{1}{l_x}\int_\omega^\infty l_x\,dx < .0005.$$

Assuming now that $\mu_x$ does not decrease for $x > \omega$, we have
according to (13)

$$\int_\omega^\infty l_x\,dx < \frac{l_\omega}{\mu_\omega},$$

so that we shall have three correct decimals in $\bar{e}_x$ provided that

$$\frac{1}{l_x}\,\frac{l_\omega}{\mu_\omega} < .0005,$$

that is, if

$$l_x > 2000\,\frac{l_\omega}{\mu_\omega}.$$

It follows from this that we shall have at least three correct
decimals in $\bar{e}_x$, calculated by (18), when $x \leq 100$, if $\omega$ is
calculated by

$$l_{100} = 2000\,\frac{l_\omega}{\mu_\omega}. \tag{19}$$

In the table as graduated by Makeham's formula we have

$$\mu_x = .0058889 + 10^{.039x + \bar{4}.0161709};$$

by this and by the expression for $\log l_x$ given above we find
$106 < \omega < 107$. The table of $l_x$ must therefore be prolonged as
far as 107 if we want the expectation of life with three correct
decimals for all ages up to 100 years, neglecting the contribution
from ages above 107. And, this is material, $l_x$ should be given
with a sufficient number of figures throughout, say five or six
significant* figures, and not, as at present, finishing off with
$l_{100} = 7$, $l_{101} = 3$, $l_{102} = 1$.
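The determination of $\omega$ from (19) can be sketched numerically. The code below is an illustration only: it assumes the Makeham expressions $\log l_x = 5.0575047 - .002557551\,x - 10^{.039x - 3.2992963}$ and $\mu_x = .0058889 + 10^{.039x - 3.9838291}$, and the bisection bracket (ages 100 to 120) and the helper name `gap` are hypothetical choices.

```python
def l(x):
    # Assumed Makeham graduation of l_x.
    return 10.0 ** (5.0575047 - 0.002557551 * x
                    - 10.0 ** (0.039 * x - 3.2992963))


def mu(x):
    # Assumed Makeham force of mortality.
    return 0.0058889 + 10.0 ** (0.039 * x - 3.9838291)


def gap(w):
    """Left side minus right side of 2000 * l_w / mu_w = l_100, cf. (19)."""
    return 2000.0 * l(w) / mu(w) - l(100)


# gap is positive at 100 and negative at 120, so bisection brackets omega.
lo, hi = 100.0, 120.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if gap(mid) > 0.0:
        lo = mid
    else:
        hi = mid
omega = 0.5 * (lo + hi)

print(omega)
print([round(l(x)) for x in (100, 101, 102)])  # the integer tail of the table
```

Under these assumptions the root falls between 106 and 107, and rounding $l_x$ to integers reproduces the abrupt tail of the printed table.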

There are, of course, other ways of determining $\omega$; we may,
for instance, consider sums instead of integrals; or we may fix $\omega$
arbitrarily, in which case it would be necessary to investigate
how many decimals in the expectation of life may be relied upon
at the various ages. But in all cases the principle remains the
same, and we need hardly go into further details.

It is sometimes objected that there is no meaning in stating
$l_x$ and similar functions with five significant figures at the highest
ages, because the observations available there, even after graduation,
are insufficient for producing this degree of accuracy, and
also because the table is to be applied to the future which never
agrees wholly with the past.

This question is closely connected with the question of the
purpose of graduation. It may be said in a general way that the
object of graduation is to replace the rough observations by a
more smooth series of data; but the smoothness thus obtained
is partly lost again, if the last few values of $l_x$ are only stated with
one or two significant figures. If now we proceed one step
further and ask why we want the table to be smooth, the answer
is: Not only because a smooth table is likely to be closer to the
truth than an irregular one, but chiefly because we want to
be justified in applying methods of interpolation, numerical
differentiation and integration, etc., in short mathematical
methods, to the table. But this requires that the function
represented by the table possesses a differential coefficient of
a certain order, and the simplest way to ensure this is to graduate
the table by an analytical function. Under these circumstances
there is evidently a need for retaining a not too small number of
figures throughout the table, and this number of figures has
nothing to do with the accuracy of the observations. It depends
on the use that is to be made of the table, and remains the same
whether the table is a purely hypothetical one, or has been
derived from a very large number of observations.

* The "significant" figures commence with the first figure that is different
from zero.


7. As an application of the principles discussed above, we
will consider the classical problem: whether it is possible to have
two different mortality tables producing, for all ages at entry
and durations, the same policy values for a life assurance with
annual premiums.

Denoting, as usual, the life annuity-due by $a_x$ and the policy
value after $t$ years by ${}_tV_x$, we have*

$${}_tV_x = 1 - \frac{a_{x+t}}{a_x}. \tag{20}$$

Let $\mu'_x$ be the force of mortality according to another table of
mortality; the corresponding life-annuities, etc., will be denoted
by $a'_x$, etc.

If, now, we are to have ${}_tV_x = {}_tV'_x$, we must, according to
(20), have

$$\frac{a'_{x+t}}{a'_x} = \frac{a_{x+t}}{a_x} \tag{21}$$

for all values of $x$ and $t$. Denoting by $k$ a constant, the condition
(21) may therefore be written

$$\frac{a_x}{a'_x} = 1 - k. \tag{22}$$

It is obvious that $k < 1$, as the expression on the left is always
positive. But $k$ must as a rule also satisfy another condition.
Let us assume that the original mortality table has been
graduated by such a formula (for instance Makeham's formula)
that $\mu_x \to \infty$ as $x \to \infty$. This is, according to paragraph 3,
equivalent to saying that $q_x \to 1$ as $x \to \infty$. We therefore have
$a_x \to 1$ as $x \to \infty$, as follows from the obvious relation

$$a_x = 1 + v\,p_x\,a_{x+1}, \tag{23}$$

where $v$ is the present value of a unit, due one year hence. But
if $a_x \to 1$, it follows from (22) that $a'_x$ tends to a limiting value
that exceeds 1 (as $k < 1$). For $k = 0$ would mean that the two
mortality tables, against hypothesis, were identical, and $k < 0$
that $a'_x$ could become smaller than unity. We therefore have

$$0 < k < 1.$$

* Institute of Actuaries' Text-Book, Part ii, Second Ed. (1902), p. 323

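Let us now put

$$\lim_{x \to \infty} a'_x = 1 + \eta \quad (\eta > 0). \tag{24}$$

In the equation

$$a'_x = 1 + v\,p'_x\,a'_{x+1} \tag{25}$$

we let $x \to \infty$ and introduce (24), putting $v = \frac{1}{1+i}$, where $i$ is
the rate of interest. We then have

$$1 + \eta = 1 + \frac{1+\eta}{1+i}\Big(1 - \lim_{x \to \infty} q'_x\Big),$$

whence

$$\lim_{x \to \infty} q'_x = \frac{1 - i\eta}{1 + \eta}. \tag{26}$$

A mortality table producing such annuity values $a'_x$ that (22)
is satisfied must therefore be a peculiar one, as $\lim q'_x < 1$. It is
easy to construct the table, for we obtain from (25)

$$p'_x = \frac{a'_x - 1}{v\,a'_{x+1}},$$

whence, as $a'_x = \frac{a_x}{1-k}$ and $a'_{x+1} = \frac{a_{x+1}}{1-k}$,

$$p'_x = \frac{a_x - 1 + k}{v\,a_{x+1}}, \tag{27}$$

or, eliminating $a_{x+1}$ by (23),

$$p'_x = p_x\,\frac{a_x - 1 + k}{a_x - 1}. \tag{28}$$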

This suffices for constructing the table of $l'_x$. But we still have
to satisfy ourselves that it is possible to give $k$ such a value,
comprised between 0 and 1, that all the values of $p'_x$ obtained
from (28) are confined to the interval from 0 to 1, as otherwise
they could not represent probabilities. Now, as $k$ is positive,
$p'_x$ is evidently always positive. The condition $p'_x < 1$ can be
written

$$k < 1 - a_x + v\,a_{x+1},$$

or by (23)

$$k < v\,q_x\,a_{x+1}. \tag{29}$$

This inequality must be satisfied for all values of $x$, and besides
we must have $0 < k < 1$. It is easy to see that values of $k$ exist
satisfying these conditions. We may, for instance, choose $k$ so
that $0 < k < vq$ where $q$ is the lowest value which $q_x$ assumes.

The restrictions on $k$ derived above are by no means observed
in the literature of the subject. Thus the condition $k > 0$,
resulting from the behaviour of $\mu_x$ and $\mu'_x$ at infinity, is disregarded
in Text-Book, Ed. 1902, l.c., pp. 336-338, where a
transformation of the mortality table is made with a negative
value of $k$, putting

$$k = -.05,$$

although the original table (Text-Book graduation) is
graduated by Makeham's formula, so that $a_x \to 1$ as $x \to \infty$.
The result is that the probabilities $p'_x$ become negative above a
certain age, determined by $a_x < 1.05$. These negative probabilities
have escaped detection, possibly because the probabilities
$p'_x$ have only been calculated up to age 95. But even continuation
beyond that age might have thrown no more light on the matter,
as the annuity values at the higher ages are very unreliable,
because the table of $l_x$ is stated in integers ($l_{100} = 4$, $l_{101} = 1$,
$l_{102} = 0$), an objectionable practice which we have criticised
above.


8. Another statistical problem which is usually treated in
such a way that it leads to contradictions is the question of
Presumptive Values of Frequency-Constants. Let us begin by
explaining wherein this problem consists.

Let there be $n$ repeated observations $o_i$ ($i = 1, 2, \ldots n$). Any
symmetrical function of all the observations, such as the
moments about a given point, the semi-invariants, etc., will be
called a "frequency-constant." We denote the $r$th moment
about the origin, after division by $n$, by $\sigma_r$, that is

$$\sigma_r = \frac{1}{n}\sum_{i=1}^{n} o_i^r, \tag{30}$$

so that, in particular, $\sigma_0 = 1$, while $\sigma_1$ is the arithmetical mean of
the observations.

Further, we denote by $m_r$ the moments about the mean, after
dividing by $n$, that is

$$m_r = \frac{1}{n}\sum_{i=1}^{n} (o_i - \sigma_1)^r \quad (r > 1). \tag{31}$$

For $r = 1$ we define $m_1 = \sigma_1$.*

* A folding sheet at the end of this publication gives the notation used
by me: English readers will notice that it differs from that customary in their
country.



The functions $\sigma_r$ as well as $m_r$ are examples of frequency-constants.
We may for the moment confine our attention to
these two classes. It is well known that they are absolutely
equivalent: from $\sigma_1, \sigma_2, \ldots \sigma_r$ we may calculate $m_1, m_2, \ldots m_r$
and vice versa. Further, that the first $n$ values of $\sigma_r$ (or $m_r$) are
equivalent to the $n$ observations which may be calculated from
them*.

If the number of trials $n$ is allowed to increase indefinitely, the
values of $\sigma_r$, being arithmetical means of the $o_i^r$, are supposed
to tend to certain limits which we shall call the true values (in a
purely mathematical sense) and shall denote by $\mathfrak{s}_r$. At the same
time the values of $m_r$ will, on account of their relations with those
of $\sigma_r$, tend to their true values $\mathfrak{m}_r$.

The question now arises: What approximations shall we use
instead of the true values, if we do not know these a priori but
only have the observations to go by? Such approximations, if
any can be found, are what is meant by the expression
"presumptive values" of frequency-constants.

The search for presumptive values goes back to no less an
authority than Gauss, who recommended, faute de mieux, to put†

$$\mathfrak{m}_2 = \frac{n}{n-1}\,m_2, \tag{32}$$

while no better value than $m_1$ could be found for $\mathfrak{m}_1$.

As the argument by which (32) is obtained is constantly
recurring in deriving various kinds of presumptive values, we may
as well reproduce it here.

We have

$$m_2 = \sigma_2 - \sigma_1^2 = \frac{1}{n}\sum o_i^2 - \frac{1}{n^2}\Big(\sum o_i\Big)^2,$$

which may be written

$$m_2 = \frac{1}{n}\Big(1 - \frac{1}{n}\Big)\sum o_i^2 - \frac{2}{n^2}\sum o_i o_j \quad (i \neq j). \tag{33}$$

As $m_2$ is a function of observations, it may itself be looked
upon as an observation. The true value of an $m_2$ calculated by $n$
observations is, of course, not the same thing as $\mathfrak{m}_2$, as it
depends on $n$. It may be denoted by a symbol showing its
dependence on $n$ or, perhaps more simply, by $E(m_2)$, being in
fact the mathematical expectation of an $m_2$ calculated by $n$
observations.

In calculating $E(m_2)$ we make use of the well-known theorems
that the expectation of a sum is equal to the sum of the expectations,
and that, if the observations are mutually independent,
the expectation of a product is equal to the product of the
expectations.

* Thiele, Theory of Observations, pp. 23-24.

† As $\mathfrak{m}_2$ denotes the true value, a new symbol should strictly be introduced
for denoting the presumptive value. We trust however that no confusion will
arise from using the same symbol.

We now obtain from (33), remembering that $o_i$ and $o_j$ are
independent of each other, as we have assumed $i \neq j$,

$$E(m_2) = \frac{1}{n}\Big(1 - \frac{1}{n}\Big)\sum E(o_i^2) - \frac{2}{n^2}\sum E(o_i)\,E(o_j);$$

but for any given value of $i$ we have $E(o_i) = \mathfrak{s}_1$, $E(o_i^2) = \mathfrak{s}_2$;
and as the number of terms in the first sum is $n$, in the second
$\binom{n}{2}$, we find

$$E(m_2) = \Big(1 - \frac{1}{n}\Big)\mathfrak{s}_2 - \frac{2}{n^2}\binom{n}{2}\mathfrak{s}_1^2,$$

or

$$E(m_2) = \frac{n-1}{n}\,\mathfrak{m}_2. \tag{34}$$

This is an exact formula, resting on the assumption that we
have an infinity of sets of observations $o_i$ with $n$ observations in
each set. If we have only one set of observations we must
instead of $E(m_2)$ take $m_2$, exactly as, if we have only one
observation $o_1$, we are compelled to use this as an approximation
to $\mathfrak{s}_1$. We thus have the approximate relation

$$m_2 = \frac{n-1}{n}\,\mathfrak{m}_2,$$

whence follows at once the Gaussian presumptive value (32).

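In the same way we obtain from (30)

$$E(\sigma_r) = \frac{1}{n}\sum E(o_i^r);$$

but $E(o_i^r) = \mathfrak{s}_r$ for any given value of $i$, so that $E(\sigma_r) = \frac{1}{n}\,n\,\mathfrak{s}_r$,
or

$$E(\sigma_r) = \mathfrak{s}_r. \tag{35}$$

Hence, we have the presumptive values

$$\mathfrak{s}_r = \sigma_r. \tag{36}$$

It follows, in particular, that $\mathfrak{m}_1 = m_1$. But (32) shows that
we do not in general have $\mathfrak{m}_r = m_r$. We find, by the same method,
leaving out the details of the calculation,

$$E(m_3) = \frac{(n-1)(n-2)}{n^2}\,\mathfrak{m}_3, \tag{37}$$

$$E(m_4) = \frac{n-1}{n^3}\Big[(n^2 - 3n + 3)\,\mathfrak{m}_4 + 3(2n - 3)\,\mathfrak{m}_2^2\Big]. \tag{38}$$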
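The expectations (34), (37) and (38) can be verified exactly in a small case by enumerating every possible set of observations. The following sketch is an illustration only: the three equally likely outcomes in `values` and the set size `n = 3` are hypothetical, and exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction
from itertools import product

values = [0, 1, 4]   # hypothetical, equally likely outcomes of one observation
n = 3                # hypothetical number of observations in each set


def central_moment(sample, r):
    """Moment of order r about the mean, after division by the sample size."""
    mean = Fraction(sum(sample), len(sample))
    return sum((Fraction(o) - mean) ** r for o in sample) / len(sample)


# True values: moments of the distribution of a single observation.
true_m2 = central_moment(values, 2)
true_m3 = central_moment(values, 3)
true_m4 = central_moment(values, 4)

# Mathematical expectations: average over all equally likely sets of n.
samples = list(product(values, repeat=n))
exp_m2 = sum(central_moment(s, 2) for s in samples) / len(samples)
exp_m3 = sum(central_moment(s, 3) for s in samples) / len(samples)
exp_m4 = sum(central_moment(s, 4) for s in samples) / len(samples)

print(exp_m2, Fraction(n - 1, n) * true_m2)
print(exp_m3, Fraction((n - 1) * (n - 2), n * n) * true_m3)
```

Each printed pair consists of two identical fractions, the first obtained by enumeration and the second from the formula.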


If now we replace $E(m_3)$ by $m_3$, and $E(m_4)$ by $m_4$, using, for $\mathfrak{m}_2$,
the already found presumptive value (32), we have as far as $r = 4$
the presumptive values

$$\begin{aligned}
\mathfrak{m}_1 &= m_1, \\
\mathfrak{m}_2 &= \frac{n}{n-1}\,m_2, \\
\mathfrak{m}_3 &= \frac{n^2}{(n-1)(n-2)}\,m_3, \\
\mathfrak{m}_4 &= \frac{n^3}{(n-1)(n^2 - 3n + 3)}\Big[m_4 - \frac{3(2n-3)}{n(n-1)}\,m_2^2\Big],
\end{aligned} \tag{39}$$

which we shall call Thiele's presumptive values, because the
corresponding values for semi-invariants (agreeing as far as $\mathfrak{m}_3$ with
the above) were first given by Thiele, who has even calculated
them as far as the 8th semi-invariant*.

9. We proceed to examine what objections may be raised to
this way of deriving presumptive values of frequency-constants.

Confining ourselves to the first two moments, which is
sufficient for our purpose, we first call attention to the fact that
the knowledge of $\sigma_1$ and $\sigma_2$ is absolutely equivalent to the
knowledge of $m_1$ and $m_2$; for from $\sigma_1$ and $\sigma_2$ we may calculate
$m_1$ and $m_2$, and vice versa, by the relations

$$\sigma_1 = m_1, \qquad \sigma_2 = m_2 + m_1^2, \tag{40}$$

which evidently also hold for the true values $\mathfrak{s}_1$, $\mathfrak{s}_2$, $\mathfrak{m}_1$, $\mathfrak{m}_2$. If,
for some purposes, we prefer $m_1$ and $m_2$ to $\sigma_1$ and $\sigma_2$, it is not
because $m_1$ and $m_2$ contain more information about the observations,
but because the information is contained in a more
convenient form; thus $m_2$ is an approximation to the square of
the standard deviation; is usually written with fewer figures
than $\sigma_2$; is easier to calculate; etc. But this being so, it becomes
extremely improbable that $m_1$ and $m_2$ can tell us something
about the true values of the first two moments which $\sigma_1$ and
$\sigma_2$ cannot also tell us. It is therefore a very suspicious fact that
if we try to derive $\mathfrak{s}_2$ from $\sigma_2$ by the same method which is used
for deriving $\mathfrak{m}_2$ from $m_2$, we arrive at the result that $\sigma_2$ must be
used unaltered, while instead of $m_2$ we must take $\frac{n}{n-1}\,m_2$ or
$m_2 + \frac{m_2}{n-1}$. This suspicion is strengthened by the consideration
that the correction to be added to $m_2$, or $\frac{m_2}{n-1}$, is of a smaller
order of magnitude than the mean error of $m_2$. For, denoting the
square of the mean error of $m_2$ by $\varepsilon^2(m_2)$, we have

$$\varepsilon^2(m_2) = E(m_2^2) - [E(m_2)]^2 = E(\sigma_2^2 - 2\sigma_2\sigma_1^2 + \sigma_1^4) - [E(m_2)]^2,$$

which may be reduced to the well-known expression

$$\varepsilon^2(m_2) = \frac{n-1}{n^3}\Big[(n-1)\,\mathfrak{m}_4 - (n-3)\,\mathfrak{m}_2^2\Big],$$

so that the mean error of $m_2$, for increasing $n$, is generally of the
order $\frac{1}{\sqrt{n}}$.

* Thiele, Theory of Observations (London, 1903), p. 48. Most of these
results were given in the first (Danish) edition of that work (1889).

But we may go further and assert that there is an actual
contradiction present. For, as the presumptive values of $\mathfrak{m}_2$, $\mathfrak{s}_1$,
$\mathfrak{s}_2$ have been obtained by the same method, they must
be considered equally good. If, however, in the relation
$\mathfrak{m}_2 = \mathfrak{s}_2 - \mathfrak{s}_1^2$, we insert these presumptive values, we find

$$\frac{n}{n-1}\,m_2 = \sigma_2 - \sigma_1^2 = m_2,$$

or $\frac{n}{n-1} = 1$, which is absurd.


10. The contradiction is brought home in a practical way, if
we assume that the problem is to determine the constants in a
frequency-function, say, for the sake of simplicity, Gauss' law
of error, by the method of moments which is here identical with
the method of least squares. Gauss' formula contains two constants
to be determined by the observations, so that we have to
use the first two moments for their determination. If we write
the formula in the form

$$y = \frac{1}{\sqrt{2\pi\gamma}}\,e^{-\frac{(x-\alpha)^2}{2\gamma}}, \tag{41}$$

we find, by the method of moments,

$$\int_{-\infty}^{\infty} xy\,dx = \alpha, \qquad \int_{-\infty}^{\infty} x^2 y\,dx = \gamma + \alpha^2,$$

whence $\alpha = \mathfrak{m}_1$, $\gamma = \mathfrak{m}_2$. If instead of these, we use their
presumptive values $m_1$ and $\frac{n}{n-1}\,m_2$, we have

$$y = \frac{1}{\sqrt{2\pi m_2\,n/(n-1)}}\,e^{-\frac{(x-m_1)^2}{2 m_2\,n/(n-1)}}. \tag{42}$$

But if we write Gauss' formula in the form

$$y = \frac{1}{\sqrt{2\pi(\beta - \alpha^2)}}\,e^{-\frac{(x-\alpha)^2}{2(\beta - \alpha^2)}}, \tag{43}$$

we have, by the method of moments,

$$\mathfrak{s}_1 = \int_{-\infty}^{\infty} xy\,dx = \alpha, \qquad \mathfrak{s}_2 = \int_{-\infty}^{\infty} x^2 y\,dx = \beta,$$

so that $\alpha = \mathfrak{s}_1$, $\beta = \mathfrak{s}_2$. Using for $\mathfrak{s}_1$ and $\mathfrak{s}_2$ their presumptive
values $\sigma_1$ and $\sigma_2$, we have

$$y = \frac{1}{\sqrt{2\pi(\sigma_2 - \sigma_1^2)}}\,e^{-\frac{(x-\sigma_1)^2}{2(\sigma_2 - \sigma_1^2)}},$$

or

$$y = \frac{1}{\sqrt{2\pi m_2}}\,e^{-\frac{(x-m_1)^2}{2 m_2}}. \tag{44}$$

But the two formulas (44) and (42) are in contradiction to
each other, and it is not possible for me to see why (42) should
be preferred to (44). We have in both cases used presumptive
values, calculated by the same method, for the frequency-constants
by which the constants of the formula are expressed.

11. It might perhaps be thought that the contradiction
involved in (44) and (42) is not of very great importance, as
$\frac{n}{n-1}$ is generally close to unity. To this we would first of all say
that even if there were no contradiction at all, the trouble of
investigating formulas for presumptive values should be spared,
if they do not mean an actual improvement of the result. But
in order to show the danger of the mode of argument employed,
we will introduce an arbitrary constant into the problem*,
enabling us, so to speak, to magnify the contradiction to any
extent we please.

Let $c$ be a constant to be chosen as we like. Instead of using
the frequency-constants $m_1$, $m_2$ we may just as well use $\theta_1$ and
$\theta_2$ defined by

$$\theta_1 = m_1, \qquad \theta_2 = m_2 + c\,m_1^2. \tag{45}$$

These relations show that $\theta_1$ and $\theta_2$ contain exactly the same
information about the observations as $m_1$ and $m_2$ (or $\sigma_1$ and $\sigma_2$).
The relations (45) evidently also hold for the true values
$\Theta_1$, $\Theta_2$ and $\mathfrak{m}_1$, $\mathfrak{m}_2$.

Matematisk Tidsskrift (1923), pp, 
Writing $\theta_2$ in the form

02 = (l — C)j«3 4- Cffa, 

% — -I 

we have , M (d^) = {i — c) + ctr^, ; 

or, introducing 

^3 == 02 ~ = ^2 + (l — c) 0 '^^, 

£(02) = (i--^-^)02+-^-^“-^^.^ (46) 

besides the obvious relation 

Eie.)-0.. (47) 

According to the general principle, we must, on the left of (47)
and (46), replace $E(\theta_1)$ and $E(\theta_2)$ by $\theta_1$ and $\theta_2$ respectively. We
thus obtain the presumptive values

^ (48) 

® n — 1 + c J 

Nothing prevents us from writing Gauss' formula in the form

\[ y = \frac{1}{\sqrt{2\pi(\beta - ca^2)}}\, e^{-\frac{(x-a)^2}{2(\beta - ca^2)}}, \tag{49} \]

where c is a given (arbitrarily chosen) constant, and determining
the two unknown constants a and β by the method of moments.
Wefind'^i = whence Ml == a, ^2 = § — 

so that, by (45), 0 t = «, 6 z = A Taking for 0 ^ and 02 their pre- 
sumptive values (48), w’-e have 


a =■ Mi , ; 

^ nd2 - cil - c) 0 j^ nm2 
^ n — 1 + c n-- I + c 


• -iso) 


Inserting these in. (49), we 'finally have . 

" I (x-mi)- . 

<y' — = e ' Zmznjin I + c) . •••(S l) ' 

V 'Y' V 27TM3#/(?2 -- 

But this result is ^ obviously absurd. It-'is' equivalent to taking 
as presumptive ;Value for M 



l8 : ^ ' ' SOME, RECENT' RESEARCHES IN THE 

with an arbitrary value of c. Choosing for c a sufficiently large
positive value, we get for it a value as small as we please; and
choosing for c a sufficiently large negative value we get such an
absurdity as a negative value of the square of a standard
deviation, besides imaginary values for y.
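
The effect of the arbitrary constant can be illustrated numerically. The following is a minimal sketch in Python, assuming that (51) amounts to using n·m/(n − 1 + c) as the square of the standard deviation, where m stands for the observed second moment about the mean; the function name and the sample figures are hypothetical.

```python
from fractions import Fraction

def presumptive_variance(second_moment, n, c):
    """Value n*m/(n - 1 + c) implied by (51); c is the arbitrary constant."""
    return Fraction(n) * second_moment / (n - 1 + c)

# Hypothetical sample: n = 10 observations, observed second moment about the mean = 4.
n, m = 10, Fraction(4)
for c in (0, 1, 100, 10**6, -20):
    # c = 0 gives the factor n/(n - 1); c = 1 gives back the uncorrected value.
    print(c, presumptive_variance(m, n, c))
```

With these figures a large positive c drives the value towards zero, and c = −20 already makes it negative, which is the absurdity referred to above.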

12. Another definition of the presumptive value of a
frequency-constant has been given by Tschuprow*. Let F be the
true value of a function of observations, and let G be such a
function of the observations that the mathematical expectation
of G is equal to F; then G is the presumptive value of F.
Hence we have in symbols†

\[ E(G) = F. \tag{53} \]

Tschuprow's presumptive values of the first four moments about
the mean are‡

Wl^ — Mi 

™ n 


+ 3) ”<4 - 3 ( 2 » - 3 ) V] 
( 54 ) 

Of these, the first three agree with Thiele's presumptive
values (39), while there is a difference in the fourth.

The difference between Thiele's and Tschuprow's methods
may briefly be characterized thus:

According to Thiele, we must first calculate $E(m_r)$ as a
function of $M_1, M_2, \ldots M_r$, that is

\[ E(m_r) = \psi(M_1, M_2, \ldots M_r). \tag{55} \]

Dropping E, we have the presumptive equations

\[ m_r = \psi(M_1, M_2, \ldots M_r), \tag{56} \]

from which the presumptive values $M_1, M_2, \ldots$ can be found in
succession for r = 1, 2, ....

* A. A. Tschuprow, Grundbegriffe und Grundprobleme der
Korrelationstheorie, pp. 74-75; see also Nordisk Statistisk Tidskrift
(1924), pp. 468-472.

† Two questions ought, strictly, to be cleared up, before proceeding with
Tschuprow's method:

1. Is there always a solution to the equation (53)?

2. Can there be more than one solution?

For the following simple applications these questions are, however, of no
importance.

‡ A number of similar results have been given quite recently by R. A.
Fisher, "Moments and Product Moments of Sampling Distributions,"
Proceedings of the London Mathematical Society, Ser. 2, vol. xxx, Part 3,
pp. 199-238.




According to Tschuprow, we must find such a function
$\phi(m_1, \ldots m_r)$ that

\[ E(\phi) = M_r. \tag{57} \]

Dropping E, we have the presumptive values

\[ M_r = \phi(m_1, \ldots m_r). \tag{58} \]


There seems little to choose between the two methods,
especially as the first three presumptive values are identical in
the two cases. The objections raised above, founded on the
presumptive value for the second moment about the mean,
evidently apply equally to both methods.

13. Contradictions of a different nature from those discussed
above have been detected by Mr N. P. Bertelsen*. Presumptive
values of frequency-constants are obtained by introducing
certain modifications into the frequency-constants obtained
directly from the observations. These modifications must not,
of course, be of such a nature that some or all of the observations
to which the modified frequency-constants correspond are
imaginary. The question whether the observations corresponding
to given values of the frequency-constants are real, is the
so-called question of the compatibility of the frequency-constants.
A particularly simple case of incompatibility has already been
mentioned above in connection with (52), as no real observations
can produce a negative square of the standard deviation; even
the value zero is only possible if all the observations have the
same value.

Mr Bertelsen's analysis is too complicated to find a place here;
we therefore confine ourselves to stating one of his results.

Let denote the rth semi-invariant, then the presumptive 
values of the first four semi-invariants are, in Tschuprow’s 

sense, 

jCXj = 111 ' , 

■■ . L . n ■ 

, . ' , K 

. ^3,; __ 2) 

— •••(S9) 

Mr Bertelsen proves that while the first three of these
presumptive values are compatible, the fourth is not always
compatible with the others.

* Skandinavisk Aktuarietidskrift pp. 129-156.
14. It appears thus that neither of the two systems of
presumptive values of frequency-constants which have so far 
been proposed, respectively by Thiele and Tschuprow, is free 
from contradictions, and that a strong case can be made even 

ft 

against the time-honoured Gaussian formula . If^ 

on the Other hand, we use the uncorrected cr,., /x,., etc., as 

the best available approximations to ay, etc., we are at 

least sure that no contradictions can ever be met with. In order
to avoid contradictions, Tschuprow has suggested* that
presumptive values ought never to be used for ordinary calculations;
but one may well ask, why should we calculate presumptive
values, if we may not use them in further calculations, e.g. for
measuring the probability of errors, for determining the
constants in frequency-functions, and so on?

* Nordisk Statistisk Tidskrift (1924), pp. 469-470.


Second Lecture 


1. Let us assume that a table of f(x) like that shown in the
first two columns below is put before a computer with the
request to calculate the value of f(x) for x = 4.5.

  x     f(x)       δ        δ²       δ³      δ⁴     δ⁵     δ⁶

  1    99833
                98836
  2   198669            - 1985
                96851             - 968
  3   295520            - 2953               30
                93898             - 938              9
  4   389418            - 3891               39            - 1
                90007             - 899              8
  5   479425            - 4790               47            + 3
                85217             - 852             11
  6   564642            - 5642               58
                79575             - 794
  7   644217            - 6436
                73139
  8   717356

(Note. In the table all values are multiplied by 10^6.)


Not knowing anything about the function except what he can
see by the table itself, he will begin by forming the difference-table
as shown, and having satisfied himself that the differences
decrease very rapidly, it is more than likely that he will have
no hesitation in interpolating as if the fifth or sixth difference
were a constant, and stating the result as f(4.5) = .434965. All
that he knows is, however, that this result is correct, if f(x) is
a polynomial of not more than the sixth degree, an hypothesis
which can only be approximately true here, as the two values of
the sixth difference obtainable are not equal to each other. The
kind of certainty our computer possesses for the correctness of
his result is therefore, strictly speaking, of a statistical nature:
he has often interpolated under similar circumstances, and has
probably never had reason to regret it.
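
The computer's procedure can be retraced in a few lines. The sketch below, in Python, forms the difference-table from the eight tabulated values and then interpolates at x = 4.5. As a stand-in for interpolating with a constant high-order difference it uses the polynomial of lowest degree through all eight entries, which is an assumption of this sketch; the function names are hypothetical.

```python
from fractions import Fraction

# Tabulated values of f(x), multiplied by 10**6, for x = 1, 2, ..., 8.
xs = list(range(1, 9))
ys = [99833, 198669, 295520, 389418, 479425, 564642, 644217, 717356]

def difference_table(values):
    """Return [values, first differences, second differences, ...]."""
    table = [list(values)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table

def lagrange(xs, ys, x):
    """Value at x of the polynomial of lowest degree through all (xs, ys)."""
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total

table = difference_table(ys)
print(table[5], table[6])            # fifth and sixth differences
estimate = lagrange(xs, ys, Fraction(9, 2)) / 10**6
print(float(estimate))
```

The two sixth differences come out unequal, as remarked above, and the interpolated value agrees with the figure quoted to six decimals.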

If now we inform our computer that the function tabulated is

\[ f(x) = \sin\frac{x}{10}, \tag{1} \]

he will see no reason to alter his views; but if we tell him that
it is

\[ f(x) = \sin\frac{x}{10} + \sin \pi x, \tag{2} \]

in which case f(4.5) = 1.434965, or that it is
in which case f(4.5) = 4.934965, he will no doubt protest, and
probably say that his method of interpolation is only intended
for well-behaved functions. Yet the functions (2) and (3) are,
in their way, just as inoffensive as (1), which shows that we
cannot, without moving in a circle, define a "well-behaved"
function as one to which our methods of interpolation apply.

The true source of the difficulty is that if we know nothing 
more about the function than what is given in the table, the 
function is undefined for any argument intermediate between 
those stated in the table. Therefore, if we insert a perfectly
arbitrary value of f(x) as corresponding to such an intermediate
argument, no disagreement arises with the information
contained in the table.
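
That the table cannot distinguish between such functions is easily verified. The sketch below, in Python, takes (1) as sin(x/10) and (2) as sin(x/10) + sin πx, which is how the two formulas are read here; the names `f1` and `f2` are hypothetical.

```python
import math

def f1(x):
    """Function (1): sin(x/10)."""
    return math.sin(x / 10)

def f2(x):
    """Function (2): sin(x/10) + sin(pi*x)."""
    return math.sin(x / 10) + math.sin(math.pi * x)

# At the tabulated arguments the two functions agree up to rounding error ...
max_gap = max(abs(f2(x) - f1(x)) for x in range(1, 9))
# ... but halfway between two arguments they differ by a whole unit.
gap_mid = f2(4.5) - f1(4.5)
print(max_gap, gap_mid)
```

Both functions reproduce every entry of the table, yet their values at x = 4.5 differ by unity.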

In order to obtain an approximation to the value of the 
function at such a place, it is not, however, necessary to know 
all about the function. As is shown in the text-books on inter- 
polation*, it is sufficient to possess limits between which the 
derivate of a certain order is situated. 

It follows from the preceding considerations that interpolation 
may be performed with two quite different objects in view which 
are, unfortunately, seldom kept sufficiently apart by practical 
computers. 

In the first place, the object of an interpolation may be to find 
the value of a function for a certain value of the argument, the 
function being tabulated for certain other arguments, and 
defined, though not tabulated, for the intermediate arguments. 
This is a problem of approximation. 

Secondly, the object of an interpolation may be to fill up, in
a reasonable way, a gap in a given series of functional values, for
a value of the argument where the function is not defined. This
is a process which is much more akin to graduation than to
interpolation and ought really to be called by a different name;
we suggest the name intercalation. If, in such a case, the usual
interpolation formulas are employed, they serve to define the
function, not to approximate to it. We are here on hypothetical
ground; we are more or less at liberty to accept or to reject the
result of the intercalation on the ground of common-sense
considerations, and there is no meaning in speaking of a greater
or lesser "accuracy" of the result; "plausibility" would be the
correct word. This is the case in a great many statistical
applications of interpolation-formulas.

* See, for instance, J. F. Steffensen: Interpolation (Baltimore, 1927).

It is very unfortunate that the processes of interpolation and
intercalation, because they happen to make use of the same
mathematical instrument, have become mixed up in an almost
inextricable way in most text-books and papers dealing with
interpolation. But it is only fair to admit that the temptation has
often been considerable. As is shown in the theory of
interpolation, the accuracy of an interpolation depends on the
remainder term of the formula applied, and this remainder term
can in a great many cases be presented in the form

\[ R = K f^{(n)}(\xi). \tag{4} \]

But simple as this form is, it is often a difficult mathematical 
problem to derive sufficiently narrow numerical limits therefrom. 
And at a time when such numerical limits are only known in a 
comparatively small number of cases, the temptation is great to 
treat the problem throughout as a problem of intercalation. Thus 
it happens that even tables of simple and fundamental functions 
are often constructed without giving a thought to the remainder 
term, and yet the results are presented as if they were mathe- 
matically proved facts. 

In my opinion, the most urgent problem that has to be faced 
by those who deal with numerical approximations, is to find 
suitable limits to (4) for the functions with which they are 
particularly concerned. As an example, I proceed to give a brief 
account of how this can be done for a function which is of 
extreme importance in actuarial science. 

2. Let us assume that the force of mortality has been
graduated by Makeham's formula

\[ \mu_x = A + B e^{\gamma x}, \tag{5} \]

so that the life table is represented by

\[ l_x = k\, e^{-Ax - \frac{B}{\gamma} e^{\gamma x}}, \tag{6} \]

where the constant k may be chosen arbitrarily. If we put

\[ z = \gamma x + \log\frac{B}{\gamma}, \qquad \theta = -\frac{A}{\gamma}, \tag{7} \]

$l_x$ assumes the form

\[ l_x = K e^{\theta z - e^{z}}, \tag{8} \]

where K denotes another constant.

As $\frac{dz}{dx} = \gamma$, it is seen that to find the nth derivate of $l_x$
is essentially the same thing as finding the nth derivate of the
function $e^{\theta z - e^{z}}$, where θ denotes a constant. Let us first consider
this problem*.

* Skandinavisk Aktuarietidskrift (1928), pp. 75-97.



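3. If the function

\[ \phi(z) = e^{\theta z - e^{z}} \tag{9} \]

is differentiated n times, we obtain φ(z) multiplied by a
polynomial of degree n in $e^{z}$ and θ. Denoting this polynomial
by $G_n(e^{z}, \theta)$, we have

\[ \phi^{(n)}(z) = \phi(z)\, G_n(e^{z}, \theta). \tag{10} \]

Now, by Taylor's theorem

\[ \phi(z+t) = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, \phi^{(n)}(z), \]

whence, by (10) and (9),

\[ e^{\theta t - e^{z}(e^{t}-1)} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, G_n(e^{z}, \theta). \]

Putting

\[ \zeta = e^{z}, \tag{11} \]

we therefore have

\[ e^{\theta t - \zeta(e^{t}-1)} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, G_n(\zeta, \theta). \tag{12} \]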

The function on the left of this equation is, therefore, the
generating function for the polynomials $G_n(\zeta, \theta)$. These are
completely determined either by (10) or by (12).

The polynomials $G_n(\zeta, \theta)$ possess many interesting properties
and deserve to be studied for their own sake. Here we content
ourselves with deriving certain relations which are of importance
for the following applications.

By differentiation of (10) we obtain, as $\phi'(z) = (\theta - e^{z})\,\phi(z)$,

\[ \phi^{(n+1)}(z) = \phi(z)\,[D_z G_n(e^{z}, \theta) + (\theta - e^{z})\, G_n(e^{z}, \theta)], \]

and from this, by (10),

\[ G_{n+1}(e^{z}, \theta) = D_z G_n(e^{z}, \theta) + (\theta - e^{z})\, G_n(e^{z}, \theta); \]

or, as $D_z = \zeta D_\zeta$,

\[ G_{n+1}(\zeta, \theta) = \zeta D_\zeta G_n(\zeta, \theta) + (\theta - \zeta)\, G_n(\zeta, \theta). \tag{13} \]

This is a recurrence formula which enables us to calculate the
polynomials $G_n(\zeta, \theta)$ in succession, the initial value $G_0(\zeta, \theta) = 1$
resulting immediately from (10).

It is now easy to prove that

\[ G_n(\zeta, \theta) = \sum_{\nu=0}^{n} (-1)^{\nu}\, \frac{\zeta^{\nu}}{\nu!}\, \Delta^{\nu}\theta^{n}, \tag{14} \]

where the difference-symbol Δ acts on θ. For this formula is
valid for n = 0; and being valid for any particular value of n, it
is proved by induction, by means of (13), that it is also valid for
the following value.
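
The agreement between the recurrence and the explicit expression can be checked by machine. The following sketch in Python assumes the recurrence (13) in the form G_{n+1} = ζ D_ζ G_n + (θ − ζ) G_n with G_0 = 1, and the formula (14) in the form of a sum of (−1)^ν ζ^ν Δ^ν θ^n / ν!. Each polynomial is stored as its list of coefficients in powers of ζ for a fixed numerical θ; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def g_by_recurrence(n, theta):
    """Coefficients of G_n in powers of zeta, built from (13) with G_0 = 1."""
    coeffs = [Fraction(1)]
    for _ in range(n):
        nxt = [Fraction(0)] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += (k + theta) * c   # zeta * D_zeta G_n  +  theta * G_n
            nxt[k + 1] -= c             # - zeta * G_n
        coeffs = nxt
    return coeffs

def g_by_differences(n, theta):
    """Coefficients of G_n from (14): (-1)^v / v! times Delta^v theta^n."""
    coeffs = []
    for v in range(n + 1):
        # v-th forward difference of theta^n, the difference acting on theta.
        delta = sum((-1) ** (v - j) * comb(v, j) * (theta + j) ** n
                    for j in range(v + 1))
        coeffs.append(Fraction((-1) ** v) * delta / factorial(v))
    return coeffs

theta = Fraction(-1, 2)   # hypothetical test value
for n in range(5):
    print(n, g_by_recurrence(n, theta) == g_by_differences(n, theta))
```

Exact rational arithmetic is used so that the comparison is not disturbed by rounding.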
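An important relation is obtained by differentiating (12) with
respect to ζ. We find

\[ \sum_{n=0}^{\infty} \frac{t^n}{n!}\, D_\zeta G_n(\zeta, \theta) = -(e^{t}-1)\, e^{\theta t - \zeta(e^{t}-1)} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, [G_n(\zeta, \theta) - G_n(\zeta, \theta+1)], \]

so that

\[ D_\zeta G_n(\zeta, \theta) = G_n(\zeta, \theta) - G_n(\zeta, \theta+1). \tag{15} \]

If we compare this formula with (13), we find that

\[ G_{n+1}(\zeta, \theta) = \theta\, G_n(\zeta, \theta) - \zeta\, G_n(\zeta, \theta+1). \tag{16} \]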

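By means of this formula we may prove that, in factorial
notation,

\[ |G_n(\zeta, \theta)| \le (|\zeta| + |\theta| + n - 1)^{(n)}. \tag{17} \]

For this formula is valid for n = 0; and being valid for some
particular value of n, we find for the following one, by (16),

\[ |G_{n+1}(\zeta, \theta)| \le |\theta|\,(|\zeta| + |\theta| + n - 1)^{(n)} + |\zeta|\,(|\zeta| + |\theta+1| + n - 1)^{(n)} \]
\[ \le |\theta|\,(|\zeta| + |\theta| + n)^{(n)} + |\zeta|\,(|\zeta| + |\theta| + n)^{(n)} = (|\zeta| + |\theta| + n)^{(n)}\,(|\zeta| + |\theta|); \]

or

\[ |G_{n+1}(\zeta, \theta)| \le (|\zeta| + |\theta| + n)^{(n+1)}, \]

so that (17) is also valid for the following value of n.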

From (17) we may derive a more convenient, though less
close, inequality. As the geometrical mean of positive quantities
is not larger than the arithmetical mean, we have, for a > 0,

\[ [a(a+1)\cdots(a+n-1)]^{1/n} \le \frac{1}{n}\,[a + (a+1) + \cdots + (a+n-1)] \]

or

\[ [a(a+1)\cdots(a+n-1)]^{1/n} \le a + \frac{n-1}{2}, \]

that is

\[ (a+n-1)^{(n)} \le \left(a + \frac{n-1}{2}\right)^{n} \qquad (a > 0). \]

Applying this to (17), we find

\[ |G_n(\zeta, \theta)| \le \left(|\zeta| + |\theta| + \frac{n-1}{2}\right)^{n}. \tag{18} \]

Although the limits to $G_n(\zeta, \theta)$ resulting from this inequality
are very rough, it may occasionally, as we shall see, render good
service.
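
How rough the two limits are may be judged from a small computation. The sketch below, in Python, assumes the recurrence (16) in the form G_{n+1}(ζ, θ) = θ G_n(ζ, θ) − ζ G_n(ζ, θ + 1) with G_0 = 1, the limit (17) as the factorial (|ζ| + |θ| + n − 1)^{(n)}, and the limit (18) as (|ζ| + |θ| + (n − 1)/2)^n. The numerical values of ζ and θ are hypothetical.

```python
from fractions import Fraction

def G(n, zeta, theta):
    """G_n(zeta, theta) from G_0 = 1 and the recurrence (16)."""
    if n == 0:
        return Fraction(1)
    return theta * G(n - 1, zeta, theta) - zeta * G(n - 1, zeta, theta + 1)

def falling_factorial(a, n):
    """a^(n) = a (a - 1) ... (a - n + 1), with a^(0) = 1."""
    result = Fraction(1)
    for k in range(n):
        result *= a - k
    return result

def bounds(n, zeta, theta):
    """Return (|G_n|, limit (17), limit (18)) as exact fractions."""
    s = abs(zeta) + abs(theta)
    b17 = falling_factorial(s + n - 1, n)
    b18 = (s + Fraction(n - 1, 2)) ** n
    return abs(G(n, zeta, theta)), b17, b18

zeta, theta = Fraction(27, 5), Fraction(-1, 2)   # hypothetical test values
for n in range(7):
    g, b17, b18 = bounds(n, zeta, theta)
    print(n, float(g), float(b17), float(b18), g <= b17 <= b18)
```

The recursion doubles its work at every step, which is harmless for the small values of n used here.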
4. Returning to the actuarial applications, we now obtain
immediately from (8), by (10),

\[ \frac{d^n l_x}{dx^n} = \gamma^n\, l_x\, G_n(e^{z}, \theta) \]

or, by (11), (7) and (5),

\[ \frac{d^n l_x}{dx^n} = \gamma^n\, l_x\, G_n(\zeta, \theta), \tag{19} \]

\[ \zeta = \frac{B}{\gamma}\, e^{\gamma x} = \frac{\mu_x - A}{\gamma}. \tag{20} \]

Further, let δ be the force of interest, and $D_x$ the commutation
function*

\[ D_x = e^{-\delta x}\, l_x. \tag{21} \]

A glance at (6) shows that, in a table graduated by Makeham's
formula, we pass from $l_x$ to $D_x$ by replacing A by A + δ. As
ζ thus remains unchanged, we have, by (19),

\[ \frac{d^n D_x}{dx^n} = \gamma^n\, D_x\, G_n(\zeta, \theta), \tag{22} \]

where ζ has the value (20), and

\[ \theta = -\frac{A + \delta}{\gamma}. \tag{23} \]

Putting, as usual, for the life-annuities 

a*= Ts 4 »,, «»= rv Bxdx, 

Ux ^ ^ ^ 

the relation between a^, and Ux follows from Euler s summation 
formula"!* which, with infinity as upper limit of integration and 

summation, may be written 

J“/(f) dt = S/(r) - \f{o) + 'S (o) ^ -S. -(24) 

i?= (25) 

Jo ’ 

It is here assumed that (°°/(t) dt and 'Zfis) are convergent, 

and that (k) o as A -> co (f = 1,2,...,^— t)- The 

convergence of the remainder term follows from these con- 
ditions. . , , . r J 

If we put/(t) = !>*+(, these conditions are clearly satished 

when the table of mortality is graduated by Makeham’s formula. 

* The actuarial function $D_x$ must, of course, not be confounded with the
symbol of differentiation employed above.
† Interpolation, § 14, formulas (9) and (10).
We thus obtain

«* = a. - I + £ y — ' (C, e) + i?, . . .(36) 


where 


R 


(2r) ! ' ») dt. (27) 

In writing ‘Cx+t instead of ^ we indicate that x has been replaced 
by 'X "f i in (20). 

5. We may now by means of the inequality (18) find limits
to the remainder term (27).

It follows from (21), (23) and (20) that we may choose the
arbitrary constant k in (6), so that

......(28) 

Inserting this in (27), noting that^ 

1 AtB„{o) I < (2 - 2‘-^0 I i , (29) 

we have 

o oi— ar foo 

R I < I I I G,S,e) I dx; 


dl 


(2r)! 


or, as dx = ^, by (i8), I being in our case positive and d 
negative. 


' i? | <y"’-- 


2 — 2^ 


(zr) 


— 2r foo 

p- 1 jS,, I ' e-i{^-6+r~ i)- d^. 

• J c 

(30) 

But, by the Theorem of Mean Value, 

roo roo 

and the maximum value of — 0 -f r — is (zr)^^* 

Hence, 


Inserting this' in (30) we have, finally, 


r~0~i 


10|- 


i?i<y— ^(2-3‘— )Q 


(2r)\ 




ei-r-0-i^ 


•(31) 


* Interpolation^ § 13, article 141 and formula (20). 



6. As an application of (26) and (31), let us take r = 3. We
then find


~~~ 2. V 


l + y 


e~t a-ey-^ 


12 


720 


+ R, 


I < ^43 Vt ^i-e-rs 
I ^ 80 0 


The factor ^ may be replaced by the slightly higher factor j, 
and for practical purposes it is preferable to put — 6 = k, so that 

y-- -(32) 

Further, we have, by (20) and (23), 

^12 12 * 


Finally, as the term y' 


,3 : 


ey - 1 


720 


is generally very small, we 


include it in the remainder term, and wTite the formula^ 


12 


R. 


R I <y- 


.3 + Ky 


720 


e^+K-3'S 

3 x: 


-•••(33) 

......(34) 


It should be noted that the limits to the error given by (34) 
are, for the values occurring in practice, considerably closer 
than those obtained simply by putting r = 2 in (31), that is 

2 y^ 


\ 3^ 


-3-S 


The inequality (34) should, of course, not be applied in each
individual case, but may with profit be employed for solving
questions of a more general nature. For instance, it is of
importance to know up to what age we may, for a given table,
rely on the first three decimals of $\bar{a}_x$ calculated by (33). We
must then, by trials, find the value of ζ for which the right-hand
side of (34) equals .0005. In the $O^{M(5)}$ table, to take an example,
we have

A = .005888861,   γ = .08980082,


* An inequality of a similar character to (34) has been found by W. Friedli
by a different method. See Mitteilungen der Vereinigung schweizerischer
Versicherungsmathematiker, 13. Heft (1918), p. 267.
so that, at 4 per cent. interest, κ = .50233. As the first term in
(34) is generally small in comparison with the second, we may
begin by putting

^^4-«~3-5 = .0005, 

3 ^ 

whence ζ = 5.555. This value is a little too large. Trying
ζ = 5.4, we get |R| < .00047, taking account of both terms in
(34). This value of ζ corresponds to

\[ \mu_x = \gamma\zeta + A = .4908, \]

or, as is seen by a glance at the table of $\mu_x$, an age between
94 and 95.

The result is that if we calculate $\bar{a}_x$ by the usual approximate
formula (33), we obtain at least three correct decimals in the
result up to age 94, in the case of the 4 per cent. table.
7. I shall not on this occasion go further into the question 
of numerical approximation, a subject which covers a con- 
siderable ground and cannot be adequately dealt with in a few 
lectures. But I think we may with profit cast a glance at certain 
simple inequalities which may often be of use to the actuary or 
statistician in their estimates. It is not yet sufficiently recognized 
that a considerable number of special results which have been 
derived by more or less heterogeneous methods, are all included, 
as particular cases, in simple inequalities of a very general and 
yet quite elementary nature. Such inequalities, between any 
number of functions, may often be derived by a very simple, 
almost intuitive, principle which I have frequently found useful, 
and of which I proceed to make a novel application. 

Let, for instance, f(ν) and φ(ν) be two positive and
non-increasing functions, and let g(ν) be a positive function. Then we
have, if $\alpha \le \mu \le \beta$,

\[ \frac{\sum_{\nu=\alpha}^{\mu} \phi(\nu) f(\nu) g(\nu)}{\sum_{\nu=\alpha}^{\mu} \phi(\nu) g(\nu)} \ge \frac{\sum_{\nu=\alpha}^{\beta} f(\nu) g(\nu)}{\sum_{\nu=\alpha}^{\beta} g(\nu)}; \tag{35} \]

for both sides of this inequality represent weighted means of
f(ν), but on the left, larger weights have been attributed to the
larger values of f(ν), as the values left out, or f(μ + 1),
f(μ + 2), ... f(β), may be considered as included with infinitely
small weights.

* See, for instance, The Journal of the Institute of Actuaries, vol. Li, 
•pp..a74r-375. 
As a rule the sign $\ge$ in (35) may be replaced by >, as equality
only occurs in certain limiting cases, for instance when φ(ν) is
a constant and μ = β, or when f(ν) is a constant.

A similar inequality holds for integrals; let f(t), φ(t) and
g(t) have the same properties as before, and let $a \le \tau \le b$; then

\[ \frac{\int_a^{\tau} \phi(t) f(t) g(t)\,dt}{\int_a^{\tau} \phi(t) g(t)\,dt} \ge \frac{\int_a^{b} f(t) g(t)\,dt}{\int_a^{b} g(t)\,dt}. \tag{36} \]

This inequality may be established independently by a similar
reasoning, or may be looked upon as a limiting case of (35).

The proof we have given of (35) is really intuitive; but it is
easy to give an arithmetical proof. Nothing prevents us from
putting φ(μ + 1), φ(μ + 2), ... φ(β) equal to zero and taking
the limits of summation as α and β everywhere. We then have
to prove that the difference

\[ \sum_{\alpha}^{\beta} \phi(\nu) f(\nu) g(\nu) \cdot \sum_{\alpha}^{\beta} g(\nu) - \sum_{\alpha}^{\beta} \phi(\nu) g(\nu) \cdot \sum_{\alpha}^{\beta} f(\nu) g(\nu) \]

cannot be negative. Performing the multiplications, we get
terms of the form

\[ \phi(n) f(n) g(n) \cdot g(m) - \phi(n) g(n) \cdot f(m) g(m) \]

which vanish if n = m. If n ≠ m we may assume n < m and
add another term, where n and m have been exchanged. We then
have to prove that the expression

\[ \phi(n) f(n) g(n) \cdot g(m) + \phi(m) f(m) g(m) \cdot g(n) - \phi(n) g(n) \cdot f(m) g(m) - \phi(m) g(m) \cdot f(n) g(n) \]

cannot be negative. But this expression can be written in the
form

\[ g(n)\, g(m)\, [f(n) - f(m)]\, [\phi(n) - \phi(m)] \]

which, as n < m, cannot assume negative values.
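
The inequality lends itself to a direct numerical check. The sketch below, in Python, takes (35) in the form: the mean of f weighted by φg over ν = α, ..., μ is not smaller than the mean of f weighted by g over ν = α, ..., β, with α = 0. The random test data and the function names are hypothetical; exact fractions are used so that cases of equality are not disturbed by rounding.

```python
from fractions import Fraction
import random

def weighted_mean_gap(phi, f, g, mu):
    """Left side minus right side of (35), with alpha = 0 and beta = len(f) - 1."""
    left_num = sum(phi[v] * f[v] * g[v] for v in range(mu + 1))
    left_den = sum(phi[v] * g[v] for v in range(mu + 1))
    right_num = sum(fv * gv for fv, gv in zip(f, g))
    right_den = sum(g)
    return Fraction(left_num, left_den) - Fraction(right_num, right_den)

def random_case(rng, size):
    """Positive non-increasing phi and f, positive g (hypothetical test data)."""
    phi = sorted((rng.randint(1, 50) for _ in range(size)), reverse=True)
    f = sorted((rng.randint(1, 50) for _ in range(size)), reverse=True)
    g = [rng.randint(1, 50) for _ in range(size)]
    return phi, f, g

rng = random.Random(1930)
gaps = []
for _ in range(200):
    phi, f, g = random_case(rng, size=8)
    mu = rng.randint(0, 7)
    gaps.append(weighted_mean_gap(phi, f, g, mu))
print(min(gaps) >= 0)
```

A single counter-example would of course refute the inequality, whereas any number of confirmations only illustrates it; the proof is the arithmetical one given above.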

Simple as the inequalities (35) and (36) are, they are by no
means trivial. Thus, putting g(ν) = 1, μ = β, we obtain from
(35) Tchebychef's celebrated inequality

\[ \sum_{\nu=\alpha}^{\beta} \phi(\nu) f(\nu) \ge \frac{1}{\beta - \alpha + 1} \sum_{\nu=\alpha}^{\beta} \phi(\nu) \cdot \sum_{\nu=\alpha}^{\beta} f(\nu), \tag{37} \]

and from (36), putting g(t) = 1, τ = b, the corresponding
inequality for integrals

\[ \int_a^{b} \phi(t) f(t)\,dt \ge \frac{1}{b-a} \int_a^{b} \phi(t)\,dt \cdot \int_a^{b} f(t)\,dt, \tag{38} \]

both of which are valid if φ and f are positive and non-increasing
functions.

The inequalities (35) and (36) are closely related to certain
other inequalities*, and it may be proved that they hold on
assumptions which are less restricted than those made here.
I believe, however, that the forms suggested above are preferable
from a practical point of view, because the intuitive principle by
which they are obtained makes it easy to derive them and,
besides, lends itself to generalizations, e.g. by introducing more
functions.


8* Turning our attention to the applications, we shall first, 
in (36), put/(i?) — ¥ — r, assuming r > o and o < a <b. We 
then get 


•(39) 


I' fg{t)dt [ rg{t)^{t)dt 

J a ^ {a 

f g(t)dt [ g (t) <jy (t) dt 

Ja da 

This is an inequality between momefits which is sometimes 
useful. Thus, putting <2=0, oo,t=oo, = — • Vx^rt > 

we find 

[ tr^dridt [ tVx+t<i>{t)dt 

J_o ^ d o 

I V xdrtdt I r dt 

> 

^'xdirt 4 * (0 dt 


or 




px-\-t 4 ^ (0 di 


Now,if pt^c+fdoes not decrease fori >0, we mayput /(i) = 


. and thus' obtain 


Px-A-t 


or 


fco 

j tlx-\-tdt 

J Ix’Artdt 

fco 

xlxdx 
lx 


■ + > . 

Ixdx 

■ ■■ ■ lx . ■ . 

provided only that px does not decrease for increasing x, 
Skandiiiavisk Aktuarietidskrift (1925), p. I37« 


(40) 
This formula has been proved by different methods and on
various assumptions as to the mortality table by L. v.
Bortkiewicz*, K. Pearson† and E. J. Gumbel‡. I have shown
elsewhere § that the condition for t > may be 

replaced by the less restricted condition < 4 fo^r t > o. 
More generally, we find, putting a = 5 = 00, r = co, 


g{t)=- 

•D X-Tty ^ (f) ? 

"r 0 



rco j'co 

fD^+tdt <rax\ f-^D^+tdt, 

Jo Jo 

......( 41 ) 

which 

may also be written 



F Ux ^ Ux * F ^ €Lx ^ 

(42) 


denoting by the annuity of order r 4- i, 


(43) 

By repeated application of (42) we get 

Fa^ < (44) 

These results may be used for finding limits to the remainder
term if we calculate the annuity-value at a different force of
interest by developing in powers of the difference between the
two forces of interest.

Corresponding results may be obtained from (35) in the
discontinuous case.

9 , Further, let zj be a rate of interest greater than f, and let 
us put 

/ I £ S J/ j 

/W = (i+7j <^W = 4+., «=i, 

We then obtain, by (35), a result which may be written 

{ii>i), ( 45 ) 

where a\-^ denotes the temporary life-annuity, the annuity- 
certain, at the rate of interest ii . 

* Die mittlere Lebensdauer, Jena, 1893, p. 77.

† Biometrika, vol. xvi, p. 297.

‡ Ibid. vol. xvii, p. 173.

§ L.c. p. 145.
If, on the other hand, we put

^(v) = 4+., /(v) = 4,4.„, ^(v) = (i + *)-', a=i, ^ = iJL = n, 

we find ......(46) 

This kind of result may be varied in many ways, for instance
by introducing more lives, by considering continuous annuity
values, etc.

10. As another example of the application of our inequalities,
let us consider the policy value of an endowment assurance. If
the sum insured is unity, the age at entry x, the original duration
n, then the value of the policy after t years is*




•(47) 


_ t : 1 | 

where denotes the temporary life annuity-due. It is assumed 
in this formula that the premiums are annual, and that the sum 
insured becomes payable at the end of the policy year of death, or 
at the expiration of n years if the person insured be then alive. 

Now let $V_1$ be the corresponding policy value at the rate of
interest $i_1$. It is, then, easy to prove that $V_1$ cannot be smaller
than V, if $i_1 < i$, that is

\[ V \le V_1 \qquad (i_1 < i), \tag{48} \]

provided only that does not decrease for increasing x. 

For if this condition is satisfied, the function <j> (v) = 

IS a non-mcreasing function, as is seen on examining (v). If 
we put, further, 

and a = Oy — t— i, all the conditions assumed 

in (35) are satisfied, and we find 

which is the same thing as (48). 

In a similar way we obtain from (36), on the same condition
as regards and denoting by V the policy value in the case of 
continuous payment, 

F<Fi (8r<S), (49) 


* Institute of Actuaries' Text-Book, Part II, Second Ed. (1902), p. 351.
11. As a last application of (35), let us consider the free
paid-up policy that can be granted in the case of a pure endowment
with annual premiums, if the payment of premiums is
discontinued. Let n be the term of the endowment*, and
let the premium be payable for r years. Then the free policy per
unit is, after t years,

\[ W = \frac{\sum_{\nu=x}^{x+t-1} D_\nu}{\sum_{\nu=x}^{x+r-1} D_\nu} \qquad (t \le r \le n). \tag{50} \]

In practice, this free policy is sometimes calculated as $\frac{t}{r}$, an
irrational rule which, besides simplicity, has the only (doubtful)
advantage that the policy-holders imagine that they understand
it! On the other hand, it is safe to use from the point of view of
the company, as it can be proved by (35) that the resulting value
cannot exceed the correct value, that is

\[ W \ge \frac{t}{r}. \tag{51} \]

For this result follows immediately from (35), putting

\[ \phi(\nu) = g(\nu) = 1, \quad f(\nu) = D_{x+\nu}, \quad \alpha = 0, \quad \mu = t - 1, \quad \beta = r - 1. \]
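
The safety of the rule t/r may be illustrated on a numerical table. The sketch below, in Python, assumes the free policy in the form W = (D_x + ... + D_{x+t−1}) / (D_x + ... + D_{x+r−1}) and builds D from Makeham's law with l_x proportional to exp(−Ax − (B/γ)e^{γx}). The constants A and γ are those quoted in the Second Lecture; the value of B, the age x = 30 and the period r = 25 are purely hypothetical.

```python
import math

def commutation_values(x, length, A, B, gamma, delta):
    """D_{x+v} for v = 0..length-1 under Makeham's law, up to a constant factor."""
    values = []
    for v in range(length):
        age = x + v
        log_l = -A * age - (B / gamma) * math.exp(gamma * age)
        values.append(math.exp(log_l - delta * age))
    return values

def free_policy(D, t, r):
    """W = (D_x + ... + D_{x+t-1}) / (D_x + ... + D_{x+r-1})."""
    return sum(D[:t]) / sum(D[:r])

# A and gamma as quoted in the text; B is a purely hypothetical value.
D = commutation_values(x=30, length=25, A=0.005888861, B=0.00005,
                       gamma=0.08980082, delta=math.log(1.04))
r = 25
checks = [(t, free_policy(D, t, r), t / r) for t in range(1, r + 1)]
print(all(w >= ratio for _, w, ratio in checks))
```

Since the values of D decrease with the age whatever the constants, the same comparison holds for any other admissible choice of B.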


* Text-Book, p. 356.


Third Lecture 


1. In this last lecture I shall deal with the theoretical
foundation of certain types of frequency-functions.

If a statistical variable x can assume certain values $x_i$ with
the corresponding probabilities $f(x_i)$, then f(x) is called the
frequency-function of the argument x. Assuming the arguments
$x_i$ to be distinct, f(x) is a discontinuous frequency-function.

But there are cases where the statistical variable can assume
any value belonging to a certain interval. If, in that case, there
exists a function f(x) such that f(x) dx is the probability of a
result situated between x and x + dx, we call f(x) a continuous
frequency-function of the argument x.

Cases may of course be imagined where x can assume any 
value belonging to a certain interval and, besides, certain isolated 
values ; but as such cases are of no practical importance, we 
confine our attention to the two main classes, the discontinuous 
and the continuous frequency-function. 

The definition itself tells us very little about the nature of
f(x). It tells us, in fact, only that f(x) must be positive, and
that, in the discontinuous case, $\sum f(x) = 1$, in the continuous
case $\int f(x)\,dx = 1$, the summation, or integration, being
extended to the whole field of possible values of x.

But in our search for suitable types of frequency-functions 
we are greatly aided by the consideration that, according to the 
above definition, a frequency-function (in the continuous case 
after multiplication by dx) is nothing but probability considered 
as a function of a parameter. It is the lasting merit of Professor 
Karl Pearson of the University of London to have pointed out 
convincingly that the natural source of practically useful types of 
frequency-functions is the elementary calculus of probabilities. 
What nature does in producing a new individual, practically 
comes to the same thing as drawing from various urns and mixing 
up the results. This explains the great success of Pearson’s types. 
Hence we may also as a general rule presume that a product of 
elementary probabilities is more likely to yield a practically 
useful frequency-function than a sum of such probabilities. 

As an example of what I have in view, let us derive Pearson’s
First Main Type (Type I) by the method which seems to me the
most convincing one*.

* The names "Main Type" and "Transition Type" seem due to Mr W.
Palin Elderton; see Frequency-Curves and Correlation, Second Ed., pp. 42-45.

Let us assume that three numbers $a$, $x$ and $b$ are given,
$a < x < b$, and that we are making experiments, the results of
which are confined to the interval from $a$ to $b$ in such a way that
all results between these limits are equally likely to occur. It
is very easy to realize these conditions in practice: we may, for
instance, let the interval $b - a$ be represented by the circumference
of a roulette on which the length $x - a$ has been marked
off. The probability of a result between $a$ and $x$ is, then,
$\frac{x-a}{b-a}$, and the probability of a result between $x$ and $b$ is
$\frac{b-x}{b-a}$.

Let us now make a compound experiment consisting of $k$
individual experiments. The probability of obtaining $\nu$ results
between $a$ and $x$, and the remainder between $x$ and $b$, is

$$\binom{k}{\nu}\left(\frac{x-a}{b-a}\right)^{\nu}\left(\frac{b-x}{b-a}\right)^{k-\nu}.$$

If we add one more experiment to the compound experiment,
demanding that one of the $k + 1$ results shall fall between $x$ and
$x + dx$, and the remaining $k$ as before, the above probability
must be multiplied by $(k+1)\,\frac{dx}{b-a}$, as the last added result
can occur at any of the $k + 1$ places in the sequence.

The probability of the compound result is, therefore,

$$(k+1)\binom{k}{\nu}\left(\frac{x-a}{b-a}\right)^{\nu}\left(\frac{b-x}{b-a}\right)^{k-\nu}\frac{dx}{b-a},$$

whence, if we write for abbreviation

$$H = \frac{(k+1)\binom{k}{\nu}}{(b-a)^{k+1}},$$

Pearson’s First Main Type

$$f(x) = H\,(x-a)^{\nu}(b-x)^{k-\nu} \qquad (a < x < b). \qquad (3)$$

If we look upon $a$, $b$, $k$ and $\nu$ as constants characterizing the
compound observation, it is easy to indicate what must be
considered as the observed value of the variable $x$. We make a
compound experiment consisting of $k + 1$ individual experiments,
the results of which we set off on the line from $a$ to $b$.
The observed value of $x$ is, then, the $(\nu + 1)$th point from the
left.
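The construction just described can be checked numerically. The following is only a sketch under stated assumptions, not part of the lecture: the function names `type_one_density`, `integrate` and `observed_value` are hypothetical, $k$ and $\nu$ are taken as integers, and the constant $H$ is taken as $(k+1)\binom{k}{\nu}/(b-a)^{k+1}$, as the derivation above implies. It evaluates the density, confirms by a simple midpoint rule that the total probability is unity, and draws the observed value as the $(\nu+1)$th of $k+1$ uniform results.

```python
import math
import random


def type_one_density(x, a, b, k, nu):
    """H (x-a)^nu (b-x)^(k-nu) for a < x < b, with H = (k+1) C(k,nu) / (b-a)^(k+1)."""
    if not a < x < b:
        return 0.0
    h = (k + 1) * math.comb(k, nu) / (b - a) ** (k + 1)
    return h * (x - a) ** nu * (b - x) ** (k - nu)


def integrate(func, lo, hi, steps=20000):
    """Midpoint rule (hypothetical helper), adequate for these smooth densities."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


def observed_value(a, b, k, nu, rng):
    """One compound experiment: k+1 uniform results on (a, b), set off on the
    line; the observed x is the (nu+1)th point from the left."""
    points = sorted(rng.uniform(a, b) for _ in range(k + 1))
    return points[nu]
```

With, say, $a = 0$, $b = 2$, $k = 5$, $\nu = 2$, the average of many simulated observed values should lie close to the mean of the density, $a + (b-a)(\nu+1)/(k+2)$.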

We have derived Pearson’s First Main Type on the assumption
that $k$ and $\nu$ are integers. The extension to non-integral, positive
values of $k$ and $\nu$ does not present any serious difficulty, as it is
really only a question of interpolating between frequency-functions
with integral values of these constants. In this case,
the binomial factor is, of course, interpreted by Gamma-functions.

Pearson’s Third Main Type (Type VI) is nothing but (3),
interpreted differently, and is obtained if we take $b$ instead of $x$
as variable and determine the constant factor so that the total
probability is unity. We need not go into details.

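But it seems more difficult to connect Pearson’s Second Main
Type (Type IV) with elementary probabilities. If we put

$$a = \alpha + \beta\sqrt{-1}, \qquad b = \alpha - \beta\sqrt{-1}, \qquad \nu = -\gamma + \tfrac{1}{2}\lambda\sqrt{-1}, \qquad k - \nu = -\gamma - \tfrac{1}{2}\lambda\sqrt{-1},$$

we obtain from (3) Pearson’s Second Main Type

$$f(x) = K\left[1 + \left(\frac{x-\alpha}{\beta}\right)^{2}\right]^{-\gamma} e^{-\lambda\,\operatorname{arc\,tg}\frac{x-\alpha}{\beta}}. \qquad (4)$$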


From a purely mathematical point of view the introduction of
this type is, therefore, quite natural and simple. But the connection
with experimental facts is lost by thus admitting complex
values of constants which were assumed to be real. It is easy by
means of a roulette arrangement to illustrate the probabilities

$$p = \frac{1}{\pi}\operatorname{arc\,tg}\frac{x-\alpha}{\beta} \qquad\text{and}\qquad dp = \frac{1}{\pi\beta}\,\frac{dx}{1 + \left(\frac{x-\alpha}{\beta}\right)^{2}},$$

and (4) may be built up by means of these probabilities; but this
proceeding seems too artificial to carry conviction, and I hope
that someone will be able to make a better suggestion.


2. To expatiate, on this occasion, on the work of the English 
School of biometricians founded and headed by Karl Pearson 
would, however, be "carrying coals to Newcastle." The chief
object of this lecture is to deal with the point of view of the 
Continental School, a subject towards which I confess that I feel 
less attracted, but on which I have, nevertheless, more to say. 

A number of continental writers, amongst them Thiele, Bruns 
and Charlier, while in England Edgeworth has represented a 
similar point of view, look upon the question of describing 
statistical distributions as a question of development in a series. 
In physics, trigonometric series are often used with remarkable
success for developing functions of a more or less unknown
nature. Why not attempt a similar thing in statistics? But as,
in the case of frequency-distributions, $\cos nx$ and $\sin nx$ do not
look very promising as elements in such series, search must be
made for more suitable elements. Several authors (I leave
aside all questions of priority) have proposed that we should
use the derivatives of the Gaussian or "normal" error-function.
Series of this form, that is

$$f(x) = \sum_{\nu=0}^{\infty} a_\nu D^{\nu} e^{-\frac{(x-\xi)^2}{2\sigma^2}}, \qquad (5)$$

are usually called A-series.

For cases presenting an exceptional degree of skewness,
Charlier has proposed another series, proceeding by differences
of Poisson’s frequency-function. This function is originally only
defined for non-negative, integral values of the argument $\nu$, viz.

$$\psi(\nu) = \frac{e^{-\lambda}\lambda^{\nu}}{\nu!}. \qquad (6)$$

But from this, Charlier constructs a continuous function $\psi(x)$,
defined for all, positive or negative, values of $x$, putting

$$\psi(x) = \frac{1}{\pi}\int_0^{\pi} e^{\lambda\cos t - \lambda}\cos(\lambda\sin t - xt)\,dt. \qquad (7)$$

It is easily verified that for $x = \nu$, $\psi(x)$ assumes the values (6)*.
If, now, $\nabla$ denotes the ascending difference, that is

$$\nabla F(x) = F(x) - F(x-1),$$

Charlier puts

$$f(x) = \sum_{\nu=0}^{\infty} b_\nu \nabla^{\nu}\psi(x), \qquad (8)$$

which is his so-called B-series.

The question naturally presents itself, under what circumstances
the A- and B-series may reasonably be said to represent
frequency-distributions. For it is perfectly clear that the series
(5) and (8) as they stand, with an infinity of terms, need have
nothing whatever to do with frequency-distributions, any more
than a trigonometric series need represent a function with
properties similar to those of its individual terms. They are
simply series representing more or less general classes of functions.
Whether such expressions as (5) and (8) deserve the name
of "frequency-functions" depends entirely on whether they may
be interpreted as probabilities or not.

* Writing i

We are therefore of opinion that the labour that has been
expended by several authors in examining the conditions under
which the A-series is ultimately convergent, interesting as it is
from the point of view of mathematical analysis, has no bearing
on the question of the statistical applications. What we want to
know is, whether the first three or four terms of the series can
be looked upon as a frequency-function, in which case we are
also sure that our function only depends on a limited number of
constants to be determined by the experience. That an expression
containing an unlimited number of constants can be made to fit
a limited experience, is neither surprising nor interesting.

There are other drawbacks to development in an infinite
series. If, as is usually the case, the coefficients are determined
by the method of moments, the validity of inverting the order
in which the summations or integrations are performed must be
proved. This difficulty does not exist, if we look at the problem
merely as a question of graduation by means of a formula,
containing a finite number of constants. But then, as in all questions
of graduation, it becomes imperative to have good reasons to
believe that the formula chosen will fit the observations. And
this means, in our case, first of all that the formula must
represent a probability.

There is a special objection to the use of the continuous
function $\psi(x)$ introduced by Charlier, and that is that the
moments of $\psi(x)$ taken between the limits $\pm\infty$ are divergent,
as I have shown elsewhere*. It seems altogether unnatural to
generalize $\psi(\nu)$ as defined by (6) into the continuous function (7),
defined for all real values of $x$. While $\psi(\nu)$ is a well-defined
probability, $\psi(x)$ cannot generally be interpreted as a probability,
as for small values of the Poisson parameter $\psi(x)$ approaches to
$\frac{\sin \pi x}{\pi x}$, which can assume negative values.
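The objection can be illustrated numerically. The sketch below is an illustration only and rests on an assumption: it takes Charlier's continuous function in the standard form $\psi(x) = \frac{1}{\pi}\int_0^{\pi} e^{\lambda\cos t - \lambda}\cos(\lambda\sin t - xt)\,dt$, with `lam` standing for the Poisson parameter (the symbol is not legible in this copy). The names `poisson` and `charlier_psi` are hypothetical, and the integral is approximated by a midpoint rule.

```python
import math


def poisson(nu, lam):
    """Poisson frequency-function for a non-negative integer nu."""
    return math.exp(-lam) * lam ** nu / math.factorial(nu)


def charlier_psi(x, lam, steps=4000):
    """Midpoint-rule value of
    (1/pi) * integral_0^pi exp(lam*cos t - lam) * cos(lam*sin t - x*t) dt."""
    width = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        total += math.exp(lam * math.cos(t) - lam) * math.cos(lam * math.sin(t) - x * t)
    return total * width / math.pi
```

For integral arguments the function reproduces the Poisson values, while for a small parameter and, for instance, $x = 1.5$ it comes out negative, close to $\sin(1.5\pi)/(1.5\pi)$.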

* Svenska Aktuarieföreningens Tidskrift (1916), pp. 226-228.

3. I shall now, avoiding the use of infinite series, proceed to
investigate under what circumstances a finite number of terms
of an A-series or a B-series may be looked upon as a frequency-function,
i.e. a probability, and how the constants may be determined
by the method of moments. I take the terms "A-series"
and "B-series" in a somewhat extended sense, meaning by an
"A-series" a series that proceeds by differential coefficients, and
by a "B-series" a series that proceeds by differences, of a given
frequency-function which I shall not specify. Only it must be
understood that the frequency-function thus employed as
generating element must be a genuine frequency-function of an
elementary type, that is*, a probability depending on a parameter,
so that its value is always positive (or zero). I have, in fact,
dealt with this question before†, but from a less general point
of view, having observed later on that it is possible to treat the
A-series and B-series simultaneously.

Let the generating function be $\varphi(x)$. This function may,
according to circumstances, be a discontinuous frequency- 
function, such as the binomial function, Poisson’s function, etc. ; 
or it may be a continuous frequency-function, such as the 
Gaussian function, one of Pearson’s types, etc. 

It is then clear that the expression

$$f(x) = \sum_{\nu=0}^{k} A_\nu\,\varphi(x - \nu\omega), \qquad (9)$$

where

$$\sum_{\nu=0}^{k} A_\nu = 1, \qquad (10)$$

is a frequency-function, provided only that the constants $A_\nu$,
without being necessarily all positive, are such that $f(x)$ does
not assume negative values. For it follows from (10) that‡
$\sum f(x) = 1$ in the case of discontinuous functions, and
$\int f(x)\,dx = 1$ in the case of continuous functions. And a linear
function of probabilities may, under these circumstances, evidently always
be interpreted as a probability.
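The two requirements, a total of unity and the absence of negative values, can be tested directly for a given set of constants. What follows is merely a sketch under assumptions: Poisson's function is taken as the generating function with unit interval, and the names `poisson`, `combined` and `check`, as well as the numerical constants used in the note below, are hypothetical.

```python
import math


def poisson(x, lam):
    """Poisson frequency-function, taken as zero outside x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    return math.exp(-lam) * lam ** x / math.factorial(x)


def combined(x, weights, omega, phi):
    """Expression (9): sum over nu of A_nu * phi(x - nu*omega)."""
    return sum(a * phi(x - nu * omega) for nu, a in enumerate(weights))


def check(weights, omega, phi, support):
    """Return (total is unity, no negative values) over the given support."""
    values = [combined(x, weights, omega, phi) for x in support]
    return abs(sum(values) - 1.0) < 1e-9, min(values) >= 0.0
```

With the constants 1.2, -0.4, 0.2 and a Poisson parameter of 3 both conditions are met, although one constant is negative; with the constants 1.5, -0.5 and a parameter of 0.5 the total is still unity, but negative values occur from $x = 2$ onwards, so that the expression is not a frequency-function.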

4. Before continuing, it will be suitable to recapitulate the
notation which will be used as regards moments and similar
functions.

If there are $n$ repeated experiments or observations $o_i$, we
write, as in the First Lecture,

$$\sigma_r = \frac{1}{n}\sum o_i^{\,r}. \qquad (11)$$

* If continuous, after multiplication by $dx$.

† "On Charlier’s Generalized Frequency-Function," Skandinavisk
Aktuarietidskrift (1924), pp. 147-152.

‡ In leaving out the limits of summation and integration we indicate that
the summation or integration is extended to the whole range of existence of
the frequency-function. This range may always be taken as $\pm\infty$, if the
frequency-function is defined as zero outside the range for which it is
employed, a very convenient convention. It is possible to include sums and
integrals in one formula, but only by modifying the usual definition of an
integral in a way with which the actuarial world is not yet familiar (Stieltjes'
Integral) and which we would therefore rather avoid.

The corresponding true values are, according to circumstances,
calculated as

$$\bar\sigma_r = \sum x^r f(x) \qquad\text{or}\qquad \bar\sigma_r = \int x^r f(x)\,dx. \qquad (12)$$

It is, of course, assumed that the nature of the function $\varphi(x)$
used for representing $f(x)$ as shown in (9), is such that the
moments (12), up to the order actually employed, are convergent.
The moments (about the origin) belonging to $\varphi(x)$ will be
denoted by $\sigma_r'$, that is

$$\sigma_r' = \sum x^r \varphi(x) \qquad\text{or}\qquad \sigma_r' = \int x^r \varphi(x)\,dx, \qquad (13)$$

according to circumstances.

Finally, it will be convenient to introduce the notation

$$\sigma_r'' = \sum_{\nu=0}^{k} \nu^{r} A_\nu, \qquad (14)$$

although $A_\nu$ is not a frequency-function, but may assume negative
values, so that there is only a formal analogy with $\bar\sigma_r$ and $\sigma_r'$.

The different kinds of moments about the origin having been
thus defined, it is hardly necessary to explain what must be
understood by the corresponding moments about the mean,
semi-invariants, etc. For instance, the relation connecting the
semi-invariants with the moments about the mean, that is

$$\sigma_{r+1} = \sum_{\nu=0}^{r}\binom{r}{\nu}\sigma_{r-\nu}\,\mu_{\nu+1}, \qquad (15)$$

also holds if we replace $\sigma$ and $\mu$ by $\bar\sigma$ and $\bar\mu$, by $\sigma'$ and $\mu'$, or by
$\sigma''$ and $\mu''$. The same applies to the relation

$$e^{\frac{\mu_1}{1!}t + \frac{\mu_2}{2!}t^2 + \cdots} = 1 + \frac{\sigma_1}{1!}t + \frac{\sigma_2}{2!}t^2 + \cdots, \qquad (16)$$

which is equivalent with (15).

A few words are advisable about the factorial moments*, an
interesting class of functions which have not yet come into such
general use as they deserve. The $r$th factorial moment is denoted
by $\sigma_{(r)}$ and defined by

$$\sigma_{(r)} = \frac{1}{n}\sum o_i^{(r)}. \qquad (17)$$


* These functions are, in principle, already found in W. Palin Elderton’s
Frequency-Curves and Correlation (1906), p. 20, or (1927) p. 20. A more
systematic treatment of them is due to W. F. Sheppard, see his paper
"Factorial Moments in Terms of Sums or Differences" (Proc. of the London
Math. Soc., Ser. 2, vol. xiii, Part 2). The notation used below is that of my
paper "Factorial Moments and Discontinuous Frequency-Functions"
(Skandinavisk Aktuarietidskrift, 1923, p. 73).

As*

$$x^{(r)} = \sum_{\nu=0}^{r}\frac{D^{\nu}0^{(r)}}{\nu!}\,x^{\nu},$$

we have

$$\sigma_{(r)} = \sum_{\nu=0}^{r}\sigma_\nu\,\frac{D^{\nu}0^{(r)}}{\nu!}; \qquad (18)$$

and inversely, from

$$x^{r} = \sum_{\nu=0}^{r}\frac{\Delta^{\nu}0^{r}}{\nu!}\,x^{(\nu)},$$

we find

$$\sigma_r = \sum_{\nu=0}^{r}\sigma_{(\nu)}\,\frac{\Delta^{\nu}0^{r}}{\nu!}, \qquad (19)$$

so that the functions $\sigma_{(r)}$ and $\sigma_r$ are equivalent†.
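The passage between the two kinds of moments is a purely arithmetical matter. As a sketch only, and on the assumption that $x^{(r)}$ denotes the factorial $x(x-1)\cdots(x-r+1)$, the coefficients $D^{\nu}0^{(r)}/\nu!$ and $\Delta^{\nu}0^{r}/\nu!$ are the Stirling numbers of the first (signed) and second kind, which may be built up by their usual recurrences. The function names below are hypothetical.

```python
def stirling_tables(order):
    """Rows 0..order of D^nu 0^(r) / nu! (signed Stirling numbers of the first
    kind) and Delta^nu 0^r / nu! (Stirling numbers of the second kind)."""
    first = [[0] * (order + 1) for _ in range(order + 1)]
    second = [[0] * (order + 1) for _ in range(order + 1)]
    first[0][0] = second[0][0] = 1
    for r in range(1, order + 1):
        for nu in range(1, r + 1):
            first[r][nu] = first[r - 1][nu - 1] - (r - 1) * first[r - 1][nu]
            second[r][nu] = second[r - 1][nu - 1] + nu * second[r - 1][nu]
    return first, second


def factorial_from_power(sigma):
    """Relation (18); sigma holds sigma_0, ..., sigma_r with sigma_0 = 1."""
    first, _ = stirling_tables(len(sigma) - 1)
    return [sum(first[r][nu] * sigma[nu] for nu in range(r + 1))
            for r in range(len(sigma))]


def power_from_factorial(sigma_fact):
    """Relation (19), the inverse passage."""
    _, second = stirling_tables(len(sigma_fact) - 1)
    return [sum(second[r][nu] * sigma_fact[nu] for nu in range(r + 1))
            for r in range(len(sigma_fact))]
```

For Poisson's function with parameter 2 the moments about the origin are 2, 6, 22, 94, and the factorial moments obtained from them are the powers 2, 4, 8, 16.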

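* Interpolation, § 6.

† Tables of the coefficients in (18) and (19) are, for instance, found in
Interpolation, § 6.

The relations (18) and (19) evidently hold, if $\sigma$ is replaced by
$\bar\sigma$, $\sigma'$ or $\sigma''$. The function $\bar\sigma_{(r)}$ is defined by

$$\bar\sigma_{(r)} = \sum x^{(r)} f(x) \qquad\text{or}\qquad \bar\sigma_{(r)} = \int x^{(r)} f(x)\,dx, \qquad (20)$$

while, in the case of $\sigma_{(r)}'$, $\varphi(x)$ takes the place of $f(x)$. For
$\sigma_{(r)}''$ we have

$$\sigma_{(r)}'' = \sum_{\nu=0}^{k}\nu^{(r)}A_\nu. \qquad (21)$$

The factorial moments about the mean are denoted by $m_{(r)}$ and
defined by

$$m_{(r)} = \frac{1}{n}\sum (o_i - m_1)^{(r)} \qquad (r > 1), \qquad (22)$$

while $m_{(1)} = \sigma_{(1)}$. Developing the right-hand side in powers of
$(o_i - m_1)$, we have

$$m_{(r)} = \sum_{\nu=2}^{r}\frac{D^{\nu}0^{(r)}}{\nu!}\,m_\nu. \qquad (23)$$

The functions $m_{(r)}$, $m_r$, etc., are thus equivalent.

The relation (23) evidently holds, if $m$ is replaced by $\bar m$, $m'$ or
$m''$.

The functions $\bar\sigma_r$, $\sigma_r'$ and $\sigma_r''$ are connected by an important
relation which may be obtained as follows. Let us, for instance,
assume that the frequency-function is continuous (the same
argument may be followed step by step, if it is discontinuous);
we have, then,

$$\bar\sigma_r = \int x^r f(x)\,dx = \sum_{\nu=0}^{k}A_\nu\int x^r\varphi(x-\nu\omega)\,dx;$$

but, by the binomial theorem,

$$x^r = \sum_{s=0}^{r}\binom{r}{s}(x-\nu\omega)^{s}(\nu\omega)^{r-s};$$

hence

$$\bar\sigma_r = \sum_{\nu=0}^{k}\sum_{s=0}^{r}\binom{r}{s}A_\nu(\nu\omega)^{r-s}\int (x-\nu\omega)^{s}\varphi(x-\nu\omega)\,dx = \sum_{\nu=0}^{k}\sum_{s=0}^{r}\binom{r}{s}A_\nu(\nu\omega)^{r-s}\sigma_s',$$

or

$$\bar\sigma_r = \sum_{s=0}^{r}\binom{r}{s}\omega^{r-s}\sigma_s'\,\sigma_{r-s}'', \qquad (24)$$

valid both for continuous and discontinuous frequency-functions.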
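Relation (24) lends itself to an exact check in the discontinuous case. The sketch below is an illustration under assumptions, not the author's computation: it takes $\sigma_r''$ as $\sum \nu^r A_\nu$, represents a frequency-function as a dictionary from argument to probability, and works with exact fractions. The names `moments`, `combine` and `moments_by_relation_24` are hypothetical.

```python
from fractions import Fraction
from math import comb


def moments(freq, order):
    """Moments about the origin, orders 0..order, of a discontinuous function
    given as a dict {argument: weight}."""
    return [sum(Fraction(x) ** r * p for x, p in freq.items())
            for r in range(order + 1)]


def combine(weights, omega, phi):
    """Expression (9): f(x) = sum over nu of A_nu * phi(x - nu*omega)."""
    f = {}
    for nu, a in enumerate(weights):
        for x, p in phi.items():
            f[x + nu * omega] = f.get(x + nu * omega, 0) + a * p
    return f


def moments_by_relation_24(weights, omega, phi, order):
    """Right-hand side of (24), built from the moments of phi and of A_nu."""
    dash = moments(phi, order)
    double_dash = [sum(Fraction(nu) ** r * a for nu, a in enumerate(weights))
                   for r in range(order + 1)]
    return [sum(comb(r, s) * Fraction(omega) ** (r - s) * dash[s] * double_dash[r - s]
                for s in range(r + 1))
            for r in range(order + 1)]
```

Taking, for instance, a symmetrical binomial on 0, 1, 2 as generating function, the constants 5/4, -1/2, 1/4 and the interval 2, the moments computed directly from (9) agree exactly with those given by (24).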

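Although this relation is quite simple, it is often more
advantageous to employ the still simpler relation between the
corresponding semi-invariants, or

$$\bar\mu_r = \mu_r' + \omega^{r}\mu_r''. \qquad (25)$$

In order to obtain this, we multiply the relation

$$e^{\frac{\mu_1'}{1!}t + \frac{\mu_2'}{2!}t^2 + \cdots} = 1 + \frac{\sigma_1'}{1!}t + \frac{\sigma_2'}{2!}t^2 + \cdots$$

by the relation

$$e^{\frac{\mu_1''}{1!}\omega t + \frac{\mu_2''}{2!}\omega^2 t^2 + \cdots} = 1 + \frac{\sigma_1''}{1!}\omega t + \frac{\sigma_2''}{2!}\omega^2 t^2 + \cdots.$$

The coefficient of $t^r$ in the exponent on the left then becomes

$$\frac{1}{r!}\left(\mu_r' + \omega^{r}\mu_r''\right),$$

while the coefficient of $t^r$ on the right becomes

$$\frac{1}{r!}\,\omega^{r}\sigma_r'' + \frac{1}{(r-1)!\,1!}\,\omega^{r-1}\sigma_{r-1}''\sigma_1' + \cdots + \frac{1}{r!}\,\sigma_r',$$

which is $\bar\sigma_r / r!$ by (24). As

$$e^{\frac{\bar\mu_1}{1!}t + \frac{\bar\mu_2}{2!}t^2 + \cdots} = 1 + \frac{\bar\sigma_1}{1!}t + \frac{\bar\sigma_2}{2!}t^2 + \cdots,$$

we therefore have

$$\bar\mu_r = \mu_r' + \omega^{r}\mu_r'',$$

or (25).

5. After these preliminaries we proceed to transform the
frequency-function $f(x)$, defined by (9).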

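Making use of the operators* $E^{a}$ and $\nabla$ defined by

$$E^{a}F(x) = F(x + a), \qquad \nabla F(x) = \frac{F(x) - F(x-\omega)}{\omega},$$

so that

$$E^{-\omega} = 1 - \omega\nabla, \qquad (26)$$

we have

$$\varphi(x-\nu\omega) = (1-\omega\nabla)^{\nu}\varphi(x),$$

or, developing the binomial,

$$\varphi(x-\nu\omega) = \sum_{s=0}^{\nu}(-1)^{s}\binom{\nu}{s}\omega^{s}\nabla^{s}\varphi(x). \qquad (27)$$

Inserting this in (9), we have

$$f(x) = \sum_{\nu=0}^{k}\sum_{s=0}^{\nu}(-1)^{s}\binom{\nu}{s}A_\nu\,\omega^{s}\nabla^{s}\varphi(x).$$

As $\binom{\nu}{s}$ vanishes for $s > \nu$, we may write $\sum_{s=0}^{k}$ instead of $\sum_{s=0}^{\nu}$,
and then, changing the order of summation, we obtain

$$f(x) = \sum_{s=0}^{k}\frac{(-1)^{s}}{s!}\,\omega^{s}\nabla^{s}\varphi(x)\sum_{\nu=0}^{k}\nu^{(s)}A_\nu.$$

But

$$\sum_{\nu=0}^{k}\nu^{(s)}A_\nu = \sigma_{(s)}'',$$

so that finally

$$f(x) = \sum_{s=0}^{k}\frac{(-1)^{s}}{s!}\,\omega^{s}\sigma_{(s)}''\,\nabla^{s}\varphi(x). \qquad (28)$$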
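That the transformed expression agrees with (9) term by term may be confirmed exactly for a discontinuous generating function. The following is a sketch under assumptions only: the difference is taken with interval $\omega$, that is $\nabla F(x) = (F(x) - F(x-\omega))/\omega$, the factorial moment $\sigma_{(s)}''$ is taken as $\sum \nu^{(s)}A_\nu$, and a binomial function with exact fractions serves as generating function. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def binomial(x, n=4, p=Fraction(1, 3)):
    """Binomial frequency-function, zero outside x = 0, ..., n."""
    if x < 0 or x > n:
        return Fraction(0)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def nabla_power(phi, x, s, omega):
    """s-th ascending difference with interval omega, divided by omega^s."""
    total = sum((-1) ** j * comb(s, j) * phi(x - j * omega) for j in range(s + 1))
    return total / Fraction(omega) ** s


def falling(nu, s):
    """Factorial nu^(s) = nu (nu-1) ... (nu-s+1)."""
    result = 1
    for i in range(s):
        result *= nu - i
    return result


def direct_9(x, weights, omega, phi):
    """Expression (9)."""
    return sum(a * phi(x - nu * omega) for nu, a in enumerate(weights))


def series_28(x, weights, omega, phi):
    """Expression (28), with sigma''_(s) computed from the constants A_nu."""
    total = Fraction(0)
    for s in range(len(weights)):
        fact_moment = sum(falling(nu, s) * a for nu, a in enumerate(weights))
        total += (Fraction((-1) ** s, factorial(s)) * Fraction(omega) ** s
                  * fact_moment * nabla_power(phi, x, s, omega))
    return total
```

With the constants 5/4, -1/2, 1/4 and the interval 2, the two expressions give identical values for every argument tried.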


* For the elements of the Calculus of Symbols we may, for instance, refer
to Steffensen, Interpolation, §§ 2 and 18.

In order to represent a frequency-distribution by this
formula, we first calculate the $\bar\mu_r$ which are obtained approximately
from the observations, as mentioned in the First Lecture.
Next, the $\mu_r'$ are calculated by means of the given function
$\varphi(x)$. We may then calculate the $\mu_r''$ by the formula

$$\mu_r'' = \frac{\bar\mu_r - \mu_r'}{\omega^{r}}, \qquad (29)$$

resulting from (25), and finally the $\sigma_{(r)}''$ which are obtained
from the $\mu_r''$ by the same formulas which connect the $\sigma_{(r)}$ with
the $\mu_r$, that is (15) and (18), or

$$\begin{aligned}
\sigma_1 &= \mu_1\\
\sigma_2 &= \mu_1\sigma_1 + \mu_2\\
\sigma_3 &= \mu_1\sigma_2 + 2\mu_2\sigma_1 + \mu_3\\
\sigma_4 &= \mu_1\sigma_3 + 3\mu_2\sigma_2 + 3\mu_3\sigma_1 + \mu_4
\end{aligned} \qquad (30)$$

and

$$\begin{aligned}
\sigma_{(1)} &= \sigma_1\\
\sigma_{(2)} &= \sigma_2 - \sigma_1\\
\sigma_{(3)} &= \sigma_3 - 3\sigma_2 + 2\sigma_1\\
\sigma_{(4)} &= \sigma_4 - 6\sigma_3 + 11\sigma_2 - 6\sigma_1.
\end{aligned} \qquad (31)$$
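The order of the calculation may be sketched as follows. This is an illustration under assumptions, not a prescribed procedure: formula (29) is taken as $\mu_r'' = (\bar\mu_r - \mu_r')/\omega^r$, the relations (30) are generated by the recurrence $\sigma_{r+1} = \sum_{\nu}\binom{r}{\nu}\mu_{\nu+1}\sigma_{r-\nu}$, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def double_dash_semi_invariants(mu_bar, mu_dash, omega):
    """Formula (29); both lists start with the semi-invariant of order 1."""
    return [(mb - md) / Fraction(omega) ** r
            for r, (mb, md) in enumerate(zip(mu_bar, mu_dash), start=1)]


def moments_from_semi_invariants(mu):
    """Relations (30) by recurrence; mu[0] is mu_1. Returns sigma_0, sigma_1, ..."""
    sigma = [Fraction(1)]
    for r in range(len(mu)):
        sigma.append(sum(comb(r, v) * mu[v] * sigma[r - v] for v in range(r + 1)))
    return sigma


def factorial_moments_31(sigma):
    """Relations (31) for orders 1 to 4; sigma[0] is sigma_0."""
    s1, s2, s3, s4 = sigma[1:5]
    return [s1, s2 - s1, s3 - 3 * s2 + 2 * s1, s4 - 6 * s3 + 11 * s2 - 6 * s1]
```

As a check on the two conversions, semi-invariants all equal to 2 (Poisson's function with parameter 2) lead to the moments 2, 6, 22, 94 and to the factorial moments 2, 4, 8, 16.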


6. The frequency-function (28) is evidently nothing but the
first $k + 1$ terms of a generalized B-series

$$\sum_{\nu=0}^{\infty} c_\nu\nabla^{\nu}\varphi(x). \qquad (32)$$

We have found the necessary and sufficient conditions which must
be satisfied in order that the first $k + 1$ terms of (32),

$$f(x) = \sum_{\nu=0}^{k} c_\nu\nabla^{\nu}\varphi(x), \qquad (33)$$

shall be a frequency-function. These conditions are that $f(x)$
does not assume negative values, and that $c_0 = 1$. For if $f(x)$
is positive and $c_0 = 1$, that is

$$\sigma_{(0)}'' = \sum_{\nu=0}^{k} A_\nu = 1,$$

and only then, the expression (9) represents a frequency-function.

If, in (28), we put $\omega = 1$, we have the usual B-type with unit
interval

$$f(x) = \sum_{s=0}^{k}\frac{(-1)^{s}}{s!}\,\sigma_{(s)}''\,\nabla^{s}\varphi(x), \qquad (34)$$

where

$$\mu_r'' = \bar\mu_r - \mu_r'. \qquad (35)$$

If, on the other hand, we let $\omega \to 0$, we have

$$\lim_{\omega\to 0}\nabla^{s}\varphi(x) = D^{s}\varphi(x), \qquad (36)$$

so that (28) contains the A-type as a limiting case. But the
calculation of the coefficients requires special attention, as we
have, in the preceding considerations, for instance in (29),
assumed $\omega \neq 0$.

Let us, for a moment, put

$$\tau_r = \omega^{r}\sigma_r''; \qquad (37)$$

we obtain then, from (24), putting $s = r - \nu$,

$$\bar\sigma_r = \sum_{\nu=0}^{r}\binom{r}{\nu}\tau_\nu\,\sigma_{r-\nu}', \qquad (38)$$

showing that the $\tau_r$ exist for $\omega \to 0$, provided the $\bar\sigma_r$ and $\sigma_r'$ do.

Now it follows from (18), if we write $\sigma''$ instead of $\sigma$, and
multiply on both sides by $\omega^{r}$, that

$$\omega^{r}\sigma_{(r)}'' = \sum_{\nu=0}^{r}\frac{D^{\nu}0^{(r)}}{\nu!}\,\omega^{r-\nu}\tau_\nu.$$

Hence we have, for $\omega \to 0$,

$$\tau_r = \lim_{\omega\to 0}\omega^{r}\sigma_{(r)}''.$$

We therefore obtain from (28), for $\omega \to 0$,

$$f(x) = \sum_{s=0}^{k}\frac{(-1)^{s}}{s!}\,\tau_s\,D^{s}\varphi(x).$$

We may finally drop the $\tau$-notation; for (38) shows that the
$\tau_r$ are calculated from the $\bar\sigma_r$ and $\sigma_r'$ by the same relations as the
$\sigma_r''$ for $\omega = 1$. If, therefore, $\mu_r''$ is calculated by

$$\mu_r'' = \bar\mu_r - \mu_r', \qquad (39)$$

replacing (29) for $\omega \to 0$, we may write $f(x)$ in the form

$$f(x) = \sum_{s=0}^{k}\frac{(-1)^{s}}{s!}\,\sigma_s''\,D^{s}\varphi(x). \qquad (40)$$

Formula (40) is the A-type which has thus, together with the
B-type, been derived from the common source (28).

7. In practice, the question of applying (28) will usually arise
under circumstances where an attempt has been made to
represent the data by some particular frequency-function $\varphi(x)$,
but the agreement is not found entirely satisfactory. In that
case, if $\varphi(x)$ depends on $\kappa$ constants which have been determined
by the method of moments, we have $\bar\mu_r = \mu_r'$ for
$r = 1, 2, \ldots, \kappa$. For the same values of $r$ we therefore have, by
(29), $\mu_r'' = 0$, and hence, by (30) and (31), $\sigma_{(r)}'' = 0$, so that
the corresponding terms in (28) drop out. It is, at any rate,
always advisable to have the same mean for $\varphi(x)$ and $f(x)$, that
is, $\sigma_1' = \bar\sigma_1$, or $\sigma_1'' = 0$. From this follows $\sigma_{(r)}'' = m_{(r)}''$ for
$r > 1$, so that (28) may be written

$$f(x) = \varphi(x) + \sum_{s=2}^{k}\frac{(-1)^{s}}{s!}\,\omega^{s}m_{(s)}''\,\nabla^{s}\varphi(x). \qquad (41)$$

Under the same circumstances the B-type becomes

$$f(x) = \varphi(x) + \sum_{s=2}^{k}\frac{(-1)^{s}}{s!}\,m_{(s)}''\,\nabla^{s}\varphi(x), \qquad (42)$$

and the A-type

$$f(x) = \varphi(x) + \sum_{s=2}^{k}\frac{(-1)^{s}}{s!}\,m_s''\,D^{s}\varphi(x). \qquad (43)$$

It is of course important to remember that these three formulas
are only valid if $f(x)$ and $\varphi(x)$ have the same mean, that
is $\bar m_1 = m_1'$.

For use in connection with these formulas we give below the
formulas for calculating the six first semi-invariants and factorial
moments about the mean, if the moments about the mean are
known:

$$\begin{aligned}
\mu_1 &= m_1; \quad \mu_2 = m_2; \quad \mu_3 = m_3\\
\mu_4 &= m_4 - 3m_2^2\\
\mu_5 &= m_5 - 10m_2m_3\\
\mu_6 &= m_6 - 15m_2m_4 - 10m_3^2 + 30m_2^3
\end{aligned} \qquad (44)$$

$$\begin{aligned}
m_{(1)} &= m_1; \quad m_{(2)} = m_2\\
m_{(3)} &= m_3 - 3m_2\\
m_{(4)} &= m_4 - 6m_3 + 11m_2\\
m_{(5)} &= m_5 - 10m_4 + 35m_3 - 50m_2\\
m_{(6)} &= m_6 - 15m_5 + 85m_4 - 225m_3 + 274m_2
\end{aligned} \qquad (45)$$

If the frequency-function is continuous, and we use for $\varphi(x)$
the normal error-function, the calculation of the coefficients is
simplified, as $0 = \mu_3' = \mu_4' = \ldots$. We shall not go into this
well-known question.
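The conversion formulas (44) and (45) are easily put into a form suitable for calculation. The sketch below is offered only as an illustration; the function names are hypothetical, the moments about the mean are passed as a list `m` with `m[r]` of order `r`, and exact fractions are used so that (45) can be compared with the definition (22) on a small set of observations.

```python
from fractions import Fraction


def central_moments(obs, order):
    """Moments about the mean, orders 0..order, of integer observations."""
    n = len(obs)
    mean = Fraction(sum(obs), n)
    return [sum((o - mean) ** r for o in obs) / n for r in range(order + 1)]


def semi_invariants_44(m):
    """Formulas (44) for orders 2 to 6."""
    return {2: m[2],
            3: m[3],
            4: m[4] - 3 * m[2] ** 2,
            5: m[5] - 10 * m[2] * m[3],
            6: m[6] - 15 * m[2] * m[4] - 10 * m[3] ** 2 + 30 * m[2] ** 3}


def factorial_central_45(m):
    """Formulas (45) for orders 2 to 6."""
    return {2: m[2],
            3: m[3] - 3 * m[2],
            4: m[4] - 6 * m[3] + 11 * m[2],
            5: m[5] - 10 * m[4] + 35 * m[3] - 50 * m[2],
            6: m[6] - 15 * m[5] + 85 * m[4] - 225 * m[3] + 274 * m[2]}


def factorial_central_direct(obs, r):
    """Definition (22): mean of (o - m_1)(o - m_1 - 1)...(o - m_1 - r + 1)."""
    n = len(obs)
    mean = Fraction(sum(obs), n)
    total = Fraction(0)
    for o in obs:
        term = Fraction(1)
        for i in range(r):
            term *= o - mean - i
        total += term
    return total / n
```

For Poisson's function with parameter 2 the moments about the mean of orders 2 to 6 are 2, 2, 14, 42, 222, and (44) returns the value 2 for every semi-invariant, as it should.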


8. Our frequency-function (28) is, on the conditions indicated
above, a genuine frequency-function, not a mere approximation
to one. Yet it has the form of a number of terms of a series and
has the advantage that if the fit is not found satisfactory, another
term may be added to those present, without it being necessary
to start the calculations all over again. Further, the calculation
of the constants is as simple as may be desired.

There are, however, considerable drawbacks. We cannot, as
with Pearson’s types, be sure beforehand that negative values will
not occur; as a matter of fact they often do occur, and this can
only be ascertained at the end of the calculation. We have not
even very good reason to expect that by adding another term
such negative values may be made to disappear. It was, furthermore,
pointed out by Edgeworth that the terms in the A-series
are not arranged according to their order of magnitude. The
re-arranged A-series involves an entirely different problem
which has been studied recently by H. Cramér in a very
thorough paper to which I would like to call attention*.

It may finally be observed that the frequency-function (28)
often presents several maxima and minima. This may be an
advantage if the experience also does so; but then, such an
experience is often of little value, as the presence of maxima and
minima is, perhaps, due to the fact that the material is not
homogeneous, or too small. We are therefore inclined to think
that the apparent generality of (28) is rather a disadvantage than
otherwise, and that Pearson’s types are as a rule preferable.

* H. Cramér, "On the composition of elementary errors," Skandinavisk
Aktuarietidskrift (1928), pp. 13-74 and pp. 141-180. Some applications of
the theory are given in a paper by the same author, "On the Mathematical
Theory of Risk," pp. 48-65, published in Försäkringsaktiebolaget Skandia
1855-1930, II, Stockholm, 1930.


NOTATION 


As the author’s notation for moments and similar frequency
constants differs from the usual English notation*, it has been
found advisable to give the following summary of the author’s
notation:

$r$th moment about origin: $\sigma_r$.

$r$th moment about mean: $m_r$, for $r > 1$.

For $r = 1$ the author defines $m_1 = \sigma_1$.

$r$th factorial moment about origin: $\sigma_{(r)}$.

$r$th factorial moment about mean: $m_{(r)}$, for $r > 1$.

For $r = 1$ the author defines $m_{(1)} = \sigma_{(1)}$.

$r$th semi-invariant: $\mu_r$.

All five symbols refer to a total frequency of unity.

If the moments are not "observed values" but so-called "true
values," a bar is placed above the leading letter, thus:

$\bar\sigma_r$, $\bar m_r$, $\bar\sigma_{(r)}$, $\bar m_{(r)}$, $\bar\mu_r$.

Moments referring to the function $\varphi(x)$ are, however,
distinguished by a dash:

$\sigma_r'$, $m_r'$, $\sigma_{(r)}'$, $m_{(r)}'$, $\mu_r'$,

and moments referring to $A_\nu$, considered as a function of $\nu$, by
a double-dash:

$\sigma_r''$, $m_r''$, $\sigma_{(r)}''$, $m_{(r)}''$, $\mu_r''$.

* In the usual English notation

$\nu_r$ is used for the author’s $m_r$,

lly ,* ff if »> * 

and $\sigma$ is used for the standard deviation.